NUCLEOTIDE SEQUENCES ENCODING AN INSECTICIDAL PROTEIN COMPLEX FROM SERRATIA



Applicants for DO/EO/US: Travis Robert Glare, Mark Robin Holmes Hurst, Trevor Anthony Jackson
Applicant herewith submits to the United States Designated/Elected Office (DO/EO/US) the following items and other information:



1. [x] This is a FIRST submission of items concerning a filing under 35 U.S.C. §371.

2. [ ] This is a SECOND OR SUBSEQUENT submission of items concerning a filing under 35 U.S.C. §371.

3. [x] This is an express request to begin national examination procedures (35 U.S.C. §371(f)) at any time rather than delay examination until
the expiration of the applicable time limit set in 35 U.S.C. §371(b) and PCT Articles 22 and 39(1).



4. [x] A proper Demand for International Preliminary Examination was made by the 19th month from the earliest claimed priority date.

5. A copy of the International Application as filed (35 U.S.C. §371(c)(2)):

a. [x] is transmitted herewith (required only if not transmitted by the International Bureau).

b. [x] has been transmitted by the International Bureau.

c. [ ] is not required, as the application was filed in the United States Receiving Office (RO/US).

6. [ ] A translation of the International Application into English (35 U.S.C. §371(c)(2)).

7. [x] Amendments to the claims of the International Application under PCT Article 19 (35 U.S.C. §371(c)(3)):

a. [ ] are transmitted herewith (required only if not transmitted by the International Bureau).

b. have been transmitted by the International Bureau.

c. [ ] have not been made; however, the time limit for making such amendments has NOT expired.

d. [ ] have not been made and will not be made.

8. [ ] A translation of the amendments to the claims under PCT Article 19 (35 U.S.C. §371(c)(3)).

9. [ ] An oath or declaration of the inventor(s) (35 U.S.C. §371(c)(4)).

10. [ ] A translation of the annexes to the International Preliminary Examination Report under PCT Article 36 (35 U.S.C. §371(c)(5)).
Items 11 to 16 concern other documents or information included: 

11. [ ] An Information Disclosure Statement under 37 C.F.R. §§ 1.97 and 1.98.

12. [ ] A DECLARATION and POWER OF ATTORNEY with claim under 35 U.S.C. § 119 for benefit of priority to Application Serial No. New
Zealand Patent No. 337610 will be submitted

13. [x] A FIRST preliminary amendment.

[ ] A SECOND or SUBSEQUENT preliminary amendment.

14. [ ] A substitute specification.

15. [ ] A change of power of attorney and/or address letter.

16. [x] Other items of information:

A SEQUENCE LISTING and DISK copy thereof with Verified Statement.

I hereby certify that this paper is being deposited
with the United States Postal "Express Mail Post 
Office to Addressee" Service under 37 C.F.R. 
§1.10 on the date indicated above and addressed 
Commissioner for Patents

U.S. Patent and Trademark Office 

Box Missing Parts 

P.O. Box 2327 

Arlington, VA 22202, on this date. 

Kimila Carraway




AMENDMENT IN RESPONSE TO NOTICE TO COMPLY WITH REQUIREMENTS FOR
PATENT APPLICATIONS CONTAINING NUCLEOTIDE SEQUENCE AND/OR AMINO
ACID SEQUENCE DISCLOSURES

Box Missing Parts 

Commissioner for Patents 

U.S. Patent and Trademark Office 

P.O. Box 2327 

Arlington, VA 22202 

Dear Sir: 

Responsive to the Notice to File Missing Parts of Nonprovisional Application 
and the Raw Sequence Listing Error Report, mailed June 19, 2002, please amend 
the application as follows: 
IN THE SEQUENCE LISTING: 

Please replace the sequence listing in the above-captioned application with 
the attached replacement SEQUENCE LISTING. A disk copy of the SEQUENCE 
LISTING accompanies this response. 

REMARKS 

A check for the fee for a one month extension of time accompanies this
response. The Commissioner is authorized to charge any additional fee that may
be due in connection with this paper or with this application during its entire
pendency to Deposit Account No. 50-1213. If a Petition for
extension of time is needed, this paper is to be considered such Petition.

Attached herewith is a copy of the Notice to File Missing Parts of a 
Nonprovisional Application mailed June 19, 2002 and the Raw Sequence Listing 
Error Report, paper and disk copies of the replacement Sequence Listing, and a 
Verified Statement that the content of the paper and computer readable copies are 
the same. 

The replacement Sequence Listing differs from the Sequence Listing as 
originally filed in that the replacement Sequence Listing is prepared in FastSEQ for 
Windows Version 4.0 and reflects corrections made in response to the Raw 
Sequence Listing Error Report, as follows: 

The General Information section has been amended to include the 
application number. 

In SEQ ID NO. 1, an inadvertently added amino acid number under the stop
codon has been deleted, subsequent amino acid numbers have been adjusted and
numbers indicating the position of the amino acids have been realigned.

In SEQ ID NO. 5, the numbers indicating the position of the amino acids 
have been realigned. 

These corrections are formal and responsive to the Raw Sequence Listing 
Error Report and the Notice to File Missing Parts mailed June 19, 2002, and thus 
no new matter has been added. 

PATENT APPLICATION
Attorney Docket No. 24747-1104US

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

Applicant: Glare et al.

Docket No.: 24747-1104US

Filed: March 1, 2002 

For: NUCLEOTIDE SEQUENCES ENCODING AN INSECTICIDAL 

PROTEIN COMPLEX FROM SERRATIA 

VERIFIED STATEMENT PURSUANT TO 37 C.F.R. § 1.821(f)

I, Megha Bhurnralkar, the undersigned, a Patent Scientific Advisor, 
in the patent practice group of Stephanie Seidman, Esq., declare that I 
personally prepared the computer-readable copy of the Sequence Listing set 
forth in above-entitled Application. The computer-readable file is titled 
1104SEQ.US2 on the disk provided herewith.

I further declare that the computer-readable form of the SEQUENCE 
LISTING is identical to the written form of the replacement sequence listing and 
that the sequence listing does not contain matter that goes beyond the scope of 
the disclosure contained in the above-identified Application. 

I further declare that all statements made herein of my own 
knowledge are true and that all statements made on information and belief are 
believed to be true; and further that these statements were made with the 
knowledge that willful false statements and the like so made are punishable by 
fine or imprisonment, or both, under Section 1001 of Title 18 of the United 
States Code and that such willful false statements may jeopardize the validity of 
the application or any patent issued thereon. 

Dated at San Diego, California this 1st day of August, 2002. 



Megha Bhurnralkar 
Patent Scientific Advisor to 
Stephanie L. Seidman 
Registration No. 33,779 
Attorney for Applicant 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

Applicant: Glare et al.

National Stage of International Appln. No.: 

PCT/NZ00/00174 
Filed: 04 September 2000 

Filed: herewith 

For: NUCLEOTIDE SEQUENCES ENCODING AN 
INSECTICIDAL PROTEIN COMPLEX FROM 
SERRATIA 

Group Art Unit: unassigned 

Examiner: unassigned 

ATTACHMENT TO THE PRELIMINARY AMENDMENT 
MARKED UP PARAGRAPHS AND CLAIMS (37 CFR §1.121) 

IN THE CLAIMS 

Please amend claims 8, 15 and 34 as follows: 

8. (Amended) A purified and isolated nucleic acid molecule [as claimed in
any one] of [claims] claim 4 [through 6].

15. (Amended) A polypeptide resulting from the transformation or
transfection of a host cell with a recombinant expression vector [as claimed in
any one] of claim [claims] 12 [through 14].

34. (Amended) An insecticidal composition [as claimed in] of claim 32 [or 33]
wherein the composition further comprises additional pesticides[,
including compounds known to possess herbicidal, fungicidal, insecticidal or
nematicidal activity].

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

Applicant: Glare et al. 

National Stage of International Appln. No.: 
PCT/NZ00/00174 
Filed: 04 September 2000 

Filed: herewith 

For: NUCLEOTIDE SEQUENCES ENCODING AN 
INSECTICIDAL PROTEIN COMPLEX FROM 
SERRATIA 

Group Art Unit: unassigned 

Examiner: unassigned 

PRELIMINARY AMENDMENT 

BOX PCT 

Commissioner for Patents 
Washington, D.C. 20231 

Dear Sir: 

Preliminary to the examination of the above-captioned application, please 
amend the application as follows: 
IN THE CLAIMS: 

Please add claims 42-48 as follows: 

42. (New) A purified and isolated nucleic acid molecule of claim 5. 

43. (New) A purified and isolated nucleic acid molecule of claim 6. 

44. (New) An insecticidal composition of claim 33, wherein the 
composition further comprises additional pesticides. 

45. (New) The insecticidal composition of claim 34, wherein an 
additional pesticide comprises a compound that has herbicidal, fungicidal, 
insecticidal or nematicidal activity. 

46. (New) The insecticidal composition of claim 44, wherein an 
additional pesticide comprises a compound that has herbicidal, fungicidal, 
insecticidal or nematicidal activity. 

47. (New) A polypeptide resulting from the transformation or
transfection of a host cell with a recombinant expression vector of claim 13.

48. (New) A polypeptide resulting from the transformation or
transfection of a host cell with a recombinant expression vector of claim 14.
Please replace claims 8, 15 and 34 with amended claims 8, 15 and 34 as 
follows: 

8. (Amended) A purified and isolated nucleic acid molecule of claim 4. 

15. (Amended) A polypeptide resulting from the transformation or
transfection of a host cell with a recombinant expression vector of claim 12.

34. (Amended) An insecticidal composition of claim 32, wherein the 
composition further comprises additional pesticides. 
IN THE SPECIFICATION 

Between the Title and "Technical Field", on page 1 of the specification, 

insert: 

--This application is the National Stage of International Application No.
PCT/NZ00/00174, filed 04 September 2000. Benefit of priority under 35
U.S.C. §365(b) to New Zealand application no. 337610, filed 02 September
1999 is claimed herein.--

REMARKS 

Any fees that may be due in connection with filing this paper or this
application during its pendency may be charged to Deposit Account No.
50-1213.

Claims 1-48 are presently pending. The claims are amended and new 
claims 42-48 added herein to delete multiple dependencies. The specification is 
amended to reflect the priority claim. Therefore, no new matter has been added 
nor have any amendments that alter the scope of the claims been introduced. 

It is respectfully requested that any references of record in the 
International stage of prosecution of this application be made of record in this 
application. 

Included as an attachment is a marked-up version of the amended claims 
pursuant to 37 C.F.R. §1.121. 

In view of the above amendments and remarks, reconsideration and 
allowance of the application are respectfully requested. 

WO 01/16305
NUCLEOTIDE SEQUENCES

Technical Field 

The present invention concerns novel nucleotide sequences encoding insecticidal proteins
from the Enterobacteriaceae, Serratia entomophila and Serratia proteamaculans, and the
use of said nucleotide sequences and insecticidal proteins.

BACKGROUND ART 

Some Serratia entomophila and Serratia proteamaculans strains in New Zealand are
known to cause a disease in the major scarab pest, Costelytra zealandica (New Zealand
grass grub). The disease was first discovered and described by Trought and Jackson (1982)
and was later named amber disease after the distinctive colour of affected insects (Stucki et
al. 1984). One species capable of causing the disease, Serratia entomophila, was
developed into a commercially-available product ("Invade") in 1989.

The disease is highly host specific, only known to infect a single indigenous species of New
Zealand scarab larva. The disease appears unique among insects and results not from rapid
invasion of the haemocoel, but from a slow colonisation of the gut. The disease has a
distinct phenotypic progression, with infected hosts ceasing feeding within 2-5 days of
ingesting pathogenic cells. The normally black gut clears around this time (Jackson et al.
1993) and the levels of the major gut digestive enzymes (trypsin and so forth) decrease
sharply (Jackson, 1995). The clearance of the gut results in a characteristic amber colour of
the infected hosts. The larvae may remain in this state for a prolonged period (1-3 months)
before bacteria eventually invade the haemocoel, causing rapid death.

The finding of a plasmid that apparently encoded the disease was reported in Glare et al.
(1993) by showing a correlation between pADAP presence and disease occurrence in
bacterial strains. This was further confirmed by Glare et al. (1996) who showed that
transfer of the plasmid from pathogenic to non-pathogenic strains resulted in a change to
pathogenic.

Grkovic et al. (1995) showed that disruption of the plasmid by transposon insertion could
alter pathogenicity without fully defining the area containing the gene cassette. By marker
exchange, they showed that a 10.5kb HindIII (pGLA20) construct from pADAP encoded
some functions of amber disease. However, the clone did not contain all disease encoding
plasmid-borne regions.

Another region involved in amber disease encoding was located by Nunez-Valdez and
Mahanty (1996). They located a locus, amb2, by transposon mutagenesis and searching a
cosmid genomic library. This region was reported to be chromosomally located and was
involved in antifeeding in the larvae of Costelytra zealandica. However, the current applicant's
research has demonstrated that the amb2 region is located on pADAP remote from the
virulence gene and is probably regulatory in function.

Insecticidal toxins which share some protein homology to the Serratia insecticidal proteins
of the present invention have been recently discovered (PCT/US96/18803;
PCT/US97/07657) by a group at Wisconsin University (Blackburn et al. 1998; Bowen et al.
1998; Bowen and Ensign 1998). These insecticidal toxins are produced from a gene region
in Photorhabdus luminescens which resembles the Serratia virulence region in the
clustering of the genes and at the protein level, but has very little DNA homology with the
Serratia genes. They have shown high molecular weight proteins from Photorhabdus
luminescens are insecticidal to a number of insects from different orders. The lack of DNA
homology over the majority of the region, as opposed to protein homology, between the
Serratia genes and Photorhabdus genes suggests that these proteins have evolved as a
result of convergent evolution leading to the formation of a distinct protein family with a
common function.

The present applicant has now found that three regions of the pADAP plasmid are required
for full insecticidal function. Sequence analysis of these three regions has shown that the
present applicant has isolated and identified a novel toxin from Serratia species that
belongs to a new family of insecticidal toxins. It is broadly to this toxin that the present
invention is directed.

Disclosure of Invention 

According to a first aspect of the present invention, there is provided an isolated nucleic
acid molecule comprising a nucleotide sequence of SEQ ID NO: 1 which encodes an
insecticidal protein complex, or a functional fragment, neutral mutation, or homolog
thereof which have at least 75% nucleic acid homology to SEQ ID NO: 1 and are capable
of hybridising with said nucleic acid molecule under stringent hybridisation conditions.

The invention also provides an isolated nucleic acid molecule comprising the nucleotide
sequence 1995-18937 of SEQ ID NO: 1 which encodes an insecticidal protein complex, or
a functional fragment, neutral mutation, or homolog thereof capable of hybridising with
said nucleic acid molecule under standard hybridisation conditions.

The invention also provides an isolated nucleic acid molecule comprising one or more of
the nucleotide sequences 2411-9547, 9589-13883 or 14546-17467 of SEQ ID NO: 1 which
encode insecticidal proteins, or a functional fragment, neutral mutation, or homolog thereof
capable of hybridising with said nucleic acid molecule under standard hybridisation
conditions.

Preferably the nucleic acid molecule comprises all of nucleotide sequences 2411-9547, 
9598-13884 and 14546-17467 of SEQ ID NO: 1. 
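
By way of illustration only, the following Python sketch shows how sub-sequences such as those recited above could be taken from a longer sequence by position. It assumes, as is conventional for sequence listings but not stated here, that positions are 1-based and inclusive. The labels `region_1` to `region_3` and the function names are hypothetical and are not taken from the specification.

```python
# Illustrative only: coordinates are ASSUMED to be 1-based and inclusive.
# The region labels below are hypothetical names for the three recited ranges.
REGIONS = {
    "region_1": (2411, 9547),
    "region_2": (9598, 13884),
    "region_3": (14546, 17467),
}


def region_length(start: int, end: int) -> int:
    """Number of nucleotides in a 1-based, inclusive range."""
    return end - start + 1


def extract_region(sequence: str, start: int, end: int) -> str:
    """Return the sub-sequence from start to end (1-based, inclusive)."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("coordinates fall outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


if __name__ == "__main__":
    for name, (start, end) in REGIONS.items():
        n = region_length(start, end)
        frame = "whole codons" if n % 3 == 0 else "not a multiple of 3"
        print(f"{name}: {start}-{end}, {n} nt ({frame})")
```

Under that assumption the three ranges are 7137, 4287 and 2922 nucleotides long, each a multiple of three, as would be expected of complete open reading frames.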

The invention further relates to an isolated nucleic acid molecule comprising a sequence of
SEQ ID NO: 1, nucleotides 1955-18937 of SEQ ID NO: 1 or one or more of nucleotides
2411-9547, 9598-13884 or 14546-17467 of SEQ ID NO: 1, operably linked to at least one
further nucleotide sequence which encodes an insecticidal protein. For example, the at least
one further nucleotide sequence may be the nucleotide sequence which codes for the
Bacillus delta endotoxins, vegetative insecticidal proteins (vips), cholesterol oxidases,
Clostridium bifermentans mosquitocidal toxins and/or Photorhabdus luminescens toxins
and so forth.

The nucleic acid molecule may comprise DNA, cDNA or RNA. 

Preferably said fragment, neutral mutation or homolog thereof is capable of hybridising to 
said nucleic acid molecule under stringent hybridisation conditions. 

The invention further relates to nucleic acid molecules which hybridise to the nucleotide
sequence of SEQ ID NO: 1, or nucleotides 1955-18937, 2411-9547, 9598-13884 or
14546-17467 of SEQ ID NO: 1 if there is at least 75% or greater identity between the sequences.
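
As a minimal sketch of what an identity threshold of this kind means in practice, the Python below computes percent identity between two sequences that have already been aligned to equal length. The specification does not say how identity is to be calculated, so the position-by-position comparison without gaps, and the function names, are assumptions made for illustration.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions that match between two pre-aligned sequences.

    ASSUMPTION: both sequences are already aligned, ungapped and of equal
    length; a real comparison would normally start from an alignment tool.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def meets_threshold(a: str, b: str, threshold: float = 75.0) -> bool:
    """True when identity is at or above the threshold (75% by default)."""
    return percent_identity(a, b) >= threshold


if __name__ == "__main__":
    print(percent_identity("ACGTACGT", "ACGTACGA"))  # 7 of 8 positions match
    print(meets_threshold("AAAA", "AATT"))           # 2 of 4 positions match
```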

The nucleic acid molecule may be isolated from Serratia entomophila or Serratia 
proteamaculans strains. 

Also provided by the present invention are recombinant expression vectors containing the 
nucleic acid molecule of the invention and hosts transformed with the vector of the 
invention capable of expressing a polypeptide of the invention. 

The vector may be selected from any suitable natural or artificial plasmid/vector. For
example, pUC19 (Yannish-Perron et al. 1995), pProEX HT (GibcoBRL, Gaithersburg,
MD, USA), pBR322 (Bolivar et al. 1977), pACYC184 (Chang et al. 1978), pLAFR3
(Staskowicz et al. 1987), and so forth.

In a further aspect, the invention provides a method of producing a polypeptide of the
invention comprising the steps of:

(a) culturing a host cell which has been transformed or transfected with a vector as
defined above to express the encoded polypeptide or peptide; and

(b) recovering the expressed polypeptide or peptide.

An additional aspect of the present invention provides a ligand that binds to a polypeptide 
of the invention. Most usually, the ligand is an antibody or antibody binding fragment. 
Such ligands also form a part of this invention. 

According to a further aspect of the present invention there are provided probes and primers
comprising a fragment of the nucleic acid molecule of the invention capable of hybridising
under stringent conditions to a native insecticidal gene sequence. Such probes and primers
are useful, for example, in studying the structure and function of this novel gene and for
obtaining homologs of the gene from bacteria other than Serratia sp.

According to a still further aspect of the present invention there is provided a polypeptide
having insecticidal activity encoded by the nucleic acid molecule of the invention, or a
functional fragment, neutral mutation or homolog thereof.

The polypeptide may comprise the amino acid sequence of SEQ ID NO: 1 or a functional 
fragment, neutral mutation or homolog thereof. 

The polypeptide may comprise amino acids 32-5118 of SEQ ID NO: 1.

The polypeptide may comprise at least one amino acid sequence of SEQ ID NO: 2; SEQ ID
NO: 3; SEQ ID NO: 4; SEQ ID NO: 5 or SEQ ID NO: 6.

Preferably the polypeptide comprises amino acid sequence SEQ ID NO: 4; SEQ ID NO: 5
and SEQ ID NO: 6.

More preferably the polypeptide comprises all of SEQ ID NOs: 2-6. 

Conveniently, the polypeptide of the invention is obtained by expression of a DNA
sequence coding therefor in a host cell or organism.

The polypeptide may comprise the amino acid sequence of SEQ ID NO: 1 linked to at least
one further amino acid sequence encoding an insecticidal protein. For example, the at least
one further amino acid sequence may be the amino acid sequence which codes for Bacillus
delta endotoxins, vegetative insecticidal proteins (vips), cholesterol oxidases, Clostridium
bifermentans mosquitocidal toxins and/or Photorhabdus luminescens toxins etc.

The invention further relates to polypeptides comprising at least 50%, preferably 60%,
more preferably 70% and most preferably 90-95% or greater identity to SEQ ID NO: 1.

The polypeptide may be produced by expression of a vector comprising the nucleic acid 
molecule of the invention or a functional fragment, neutral mutation or homolog thereof, in 
a suitable host cell. 

According to a further aspect, there is provided an insecticidal composition comprising at
least the polypeptide of the invention and an agriculturally acceptable carrier such as would
be known to a person skilled in the art. More than one polypeptide of the invention can, of
course, be included in the composition. In addition, the composition may comprise one or
more additional pesticides, for example, compounds known to possess herbicidal,
fungicidal, insecticidal or nematicidal activity.

The composition may further comprise other known insecticidally active agents, such as
Bacillus delta endotoxins, vegetative insecticidal proteins (vips), cholesterol oxidases,
Clostridium bifermentans mosquitocidal toxins and/or Photorhabdus luminescens toxins
and so forth.

According to a further aspect, there is provided a method of combating pests, especially
insects at a locus or host for the pest infested with or liable to be infested therewith, said
method comprising applying to a locus, host and/or the pest, an effective amount of the
polypeptide of the invention that has functional insecticidal activity against said pest.

According to a further aspect the invention provides a method of inducing amber disease or 
like condition in insects comprising delivery to an insect an effective amount of the 
polypeptide of the invention that has functional insecticidal activity against said insect. 

The insect may be selected from the orders comprising Coleoptera (such as the black beetle,
Heteronychus arator (F.), or the black vine weevil, Otiorhynchus sulcatus (F.));
Dictyoptera (e.g. the German cockroach, Blattella germanica (L.), or the subterranean
termite Coptotermes spp.); Diptera (e.g. the housefly Musca domestica L. or the blowfly
Lucilia cuprina (Wiedemann)); Orthoptera (e.g. the black field cricket Teleogryllus
commodus (Walker) or the migratory locust Locusta migratoria L.); Hymenoptera (e.g. the
German wasp, Vespula germanica F.); Hemiptera (such as the green vegetable bug Nezara
viridula (L.) or the green peach aphid Myzus persicae (Sulzer)); and Lepidoptera (e.g. the
tomato fruitworm, Helicoverpa armigera (Walker), or the codling moth, Laspeyresia
pomonella (L.)).

The insecticidal polypeptide may be delivered to the insect orally either as a solid bait
matrix, as a sprayable insecticide sprayed onto a substrate upon which the insect feeds,
applied directly to the soil subsurface or as a drench, or expressed in a transgenic plant,
bacterium, virus or fungus upon which the insect feeds, or by any other suitable method
which would be obvious to a person skilled in the art.

According to a further aspect, the invention provides a transgenic plant, bacterium, virus or
fungus, incorporating in its genome, a nucleic acid molecule of the invention providing the
plant, bacterium, virus or fungus with an ability to express an effective amount of an
insecticidal polypeptide.

Definitions and Methods 

The following definitions and methods are provided to better define the present invention
and to guide those of ordinary skill in the art in the practice of the present invention.

Definitions of common terms in molecular biology may also be found in Lewin, Genes V, 
Oxford University Press: New York, 1994. 

The term "native" refers to a naturally-occurring nucleic acid or polypeptide, including
wild-type sequences and alleles thereof.

A "homolog" has at least one of the biological activities of the nucleic acid or polypeptide 
of the invention and comprises at least 50-70% identical amino acid or nucleic acid 
sequence thereto, preferably 75-85% and most preferably 90-95% identical amino acid or 
nucleic acid sequence thereto. 

The term "neutral mutation" means a mutation (that is, a change in the nucleotide or
polypeptide sequence such as by deletion, substitution, inversion or insertion) which
has no effect on the function of the encoded protein.
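
One simple instance of a mutation with no effect on the encoded protein is a synonymous codon substitution, which changes the nucleotide sequence but not the amino acid sequence. The Python sketch below illustrates only that narrow case, using the standard genetic code. The definition above is broader (it turns on protein function, not sequence), and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence, read in frame from its first base."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_synonymous(original: str, mutant: str) -> bool:
    """True when both sequences encode the same amino acid sequence."""
    return len(original) == len(mutant) and translate(original) == translate(mutant)


if __name__ == "__main__":
    # GCT and GCC both encode alanine, so this substitution is synonymous.
    print(is_synonymous("ATGGCTTAA", "ATGGCCTAA"))
    # GCT -> GAT replaces alanine with aspartate, so this one is not.
    print(is_synonymous("ATGGCTTAA", "ATGGATTAA"))
```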

As indicated above, also possible are variants of the polypeptide or peptide that differ from
the native amino acid sequence by insertion, substitution or deletion of one or more amino
acids. Where such a variant is desired, the nucleotide sequence of the native DNA is
altered appropriately. This alteration can be made through elective synthesis of the DNA,
or by modification of the native DNA by, for example, site specific or cassette mutagenesis.
Preferably, where portions of cDNA or genomic DNA require sequence modifications,
site-specific primer directed mutagenesis is employed using techniques standard in the art.

In a further aspect, the present invention consists in a replicable transfer vector suitable for
use in preparing a polypeptide of the invention. These vectors may be constructed
according to techniques well known in the art, or may be selected from cloning vectors
available in the art.

The cloning vector may be selected according to the host or host cell to be used. Useful
vectors will generally have the following characteristics:

(a) the ability to self-replicate;

(b) the possession of a single target for any particular restriction endonuclease; and

(c) desirably, carry genes for a readily selectable marker such as antibiotic resistance.
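
Characteristic (b) can be checked computationally. The following Python sketch, offered only as an illustration, counts how often a recognition sequence occurs in a circular plasmid and reports the enzymes that would cut exactly once. The recognition sequences shown for EcoRI, HindIII and BamHI are the well-known ones; the function names and the short test plasmids are hypothetical.

```python
# Recognition sequences of three commonly used restriction endonucleases.
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "BamHI": "GGATCC"}


def count_sites(plasmid: str, site: str) -> int:
    """Count occurrences of a recognition site in a circular sequence.

    The first len(site) - 1 bases are appended to the end so that a site
    spanning the arbitrary start/end point of the plasmid is also found.
    The three sites used here are palindromic, so one strand is enough.
    """
    plasmid, site = plasmid.upper(), site.upper()
    circular = plasmid + plasmid[:len(site) - 1]
    return sum(1 for i in range(len(plasmid)) if circular.startswith(site, i))


def single_cutters(plasmid: str, sites: dict = SITES) -> list:
    """Names of enzymes whose recognition site occurs exactly once."""
    return [name for name, site in sites.items() if count_sites(plasmid, site) == 1]


if __name__ == "__main__":
    toy_plasmid = "TTGAATTCAAGGATCCTTGGATCCAA"  # hypothetical sequence
    for name, site in SITES.items():
        print(name, count_sites(toy_plasmid, site))
    print(single_cutters(toy_plasmid))
```

In the toy plasmid only EcoRI cuts once; BamHI cuts twice and HindIII not at all, so only the EcoRI site would serve as a single cloning target.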

Two major types of vector possessing these characteristics are plasmids and bacterial
viruses (bacteriophages or phages). Presently preferred vectors include plasmids
pMOS-Blue, pGem-T and pUC8.

The nucleic acids of the present invention can be free in solution, or attached by
conventional means to a solid support, or present in an expression vector or any other type
of plasmid.

The term "isolated" means substantially separated or purified away from contaminating
sequences in the cell or organism in which the nucleic acid naturally occurs and includes
nucleic acids purified by standard purification techniques as well as nucleic acids prepared
by recombinant technology and those chemically synthesised.

The term "DNA construct" means a construct incorporating the nucleic acid molecule of
the present invention, or a functional fragment, neutral mutation or homolog thereof in a
position whereby the protein coding sequence is under the control of an operably linked
promoter capable of expression in a plant cell. Such promoters are well known in the art.

A fragment of a nucleic acid molecule according to the present invention is a portion of the
nucleic acid that is less than full length and comprises at least a minimum length capable of
hybridising specifically with a nucleic acid molecule according to the present invention (or
a sequence complementary thereto) under stringent conditions as defined below. A
fragment according to the present invention has at least one of the biological activities of
the nucleic acid or polypeptide of the present invention.

Nucleic acid probes and primers can be prepared based on nucleic acids according to the
present invention (for example, the sequence of SEQ ID NO: 1). A "probe" comprises an
isolated nucleic acid attached to a detectable label or reporter molecule well known in the
art. Typical labels include radioactive isotopes, ligands, chemiluminescent agents, and
enzymes.

"Primers" are short nucleic acids, preferably DNA oligonucleotides 15 nucleotides or more 
in length, which are annealed to a complementary target DNA strand by nucleic acid
hybridisation to form a hybrid between the primer and the target DNA strand, then
extended along the target DNA strand by a polymerase, preferably a DNA polymerase.
Primer pairs can be used for amplification of a nucleic acid sequence (for example, by the
polymerase chain reaction (PCR) or other nucleic acid amplification methods well known
in the art). PCR-primer pairs can be derived from the sequence of a nucleic acid according
to the present invention (for example, by using computer programs intended for that
purpose such as Primer (Version 0.5, © 1991, Whitehead Institute for Biomedical Research,
Cambridge, MA)).
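
The relationship between a primer and its complementary target strand can be sketched as follows. This Python example is illustrative only: it treats annealing as an exact match, whereas real hybridisation tolerates some mismatch, and it is not a reconstruction of the Primer program mentioned above. The 15-nucleotide minimum follows the preferred length given in the text; the function names and the test template are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_all(haystack: str, needle: str) -> list:
    """All 0-based start positions of needle in haystack, overlaps included."""
    positions, start = [], haystack.find(needle)
    while start != -1:
        positions.append(start)
        start = haystack.find(needle, start + 1)
    return positions


def primer_sites(template: str, primer: str, min_length: int = 15) -> dict:
    """Locate exact-match annealing sites for a primer on a duplex template.

    `template` is the top strand written 5' to 3'. A "forward" primer has the
    same sequence as the top strand and anneals to the bottom strand; a
    "reverse" primer anneals to the top strand, so its reverse complement is
    what appears in `template`.
    """
    if len(primer) < min_length:
        raise ValueError(f"primer shorter than {min_length} nucleotides")
    template, primer = template.upper(), primer.upper()
    return {
        "forward": find_all(template, primer),
        "reverse": find_all(template, reverse_complement(primer)),
    }


if __name__ == "__main__":
    template = "GGGATGACCGTTAGCATCGACCC"  # hypothetical template
    print(primer_sites(template, "ATGACCGTTAGCATC"))
```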

Methods for preparing and using probes and primers are described, for example, in
Sambrook et al. Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, ed. Sambrook
et al., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989.

Probes or primers can be free in solution or covalently or noncovalently attached to a solid 
support by standard means. 

The term "operably linked" means a first nucleic acid sequence linked to a second nucleic
acid sequence when the first nucleic acid sequence is placed in a functional relationship
with the second nucleic acid sequence. For instance, a promoter is operably linked to a
coding sequence if the promoter affects the transcription or expression of the coding
sequence. Generally, operably linked DNA sequences are contiguous and, where necessary
to join two protein coding regions, in reading frame.

The DNA molecules of the invention may be expressed by placing them in operable linkage
with suitable control sequences in a replicable expression vector. Control sequences may
include origins of replication, a promoter, enhancer and transcriptional terminator
sequences, amongst others. The selection of the control sequence to be included in the
expression vector is dependent on the type of host or host cell intended to be used for
expressing the DNA.

A "recombinant" nucleic acid is one that has a sequence that is not naturally occurring or 
has a sequence that is made by an artificial combination of two otherwise separated 
segments of sequence. This artificial combination is often accomplished by chemical 
synthesis or, more commonly, by the artificial manipulation of isolated segments of nucleic
acids (for example, by genetic engineering techniques).

Techniques for nucleic acid manipulation are described generally in, for example, 
Sambrook et al. (1989). 

Large amounts of a nucleic acid according to the present invention can be produced by
recombinant means well known in the art or by chemical synthesis.

Natural or synthetic nucleic acids according to the present invention can be incorporated
into recombinant nucleic acid constructs, typically DNA constructs, capable of introduction
into and replication in a host cell. Usually the DNA constructs will be suitable for
replication in a unicellular host, such as E. coli or other commonly used bacteria, but can
also be introduced into yeast, mammalian, plant or other eukaryotic cells.

Preferably, such a nucleic acid construct is a vector comprising a replication system
recognised by the host. For the practice of the present invention, well known compositions
and techniques for preparing and using vectors, host cells, introduction of vectors into host
cells and so forth are employed, as discussed, inter alia, in Sambrook et al. (1989).

A cell, tissue, organ, or organism into which has been introduced a foreign nucleic acid,
such as a recombinant vector, is considered "transformed" or "transgenic". The DNA
construct comprising a DNA sequence according to the present invention that is present in
a transgenic host cell, particularly a transgenic plant, is referred to as a "transgene". The
term "transgenic" or "transformed", when referring to a cell or organism, also includes:

(1) progeny of the cell or organism, and

(2) plants produced from a breeding program employing such a "transgenic" plant as a
parent in a cross and exhibiting an altered phenotype resulting from the presence of
the recombinant DNA construct.

Generally, procaryotic, yeast, insect, or mammalian cells are useful hosts. Also included
within the term hosts are plasmid vectors. Suitable procaryotic hosts include E. coli,
Bacillus species and various species of Pseudomonas. Commonly used promoters such as the
β-lactamase (penicillinase) and lactose (lac) promoter systems are all well known in the art.
Any available promoter system compatible with the host of choice can be used. Vectors
used in yeast are also available and well known. A suitable example is the 2 micron origin
of replication plasmid.

Similarly, vectors for use in mammalian cells are also well known. Such vectors include 
well known derivatives of SV-40, adenovirus, retrovirus-derived DNA sequences, Herpes 
simplex virus, and vectors derived from a combination of plasmid and phage DNA. 

Further eucaryotic expression vectors are known in the art (for example in P.J. Southern
and P. Berg, J. Mol. Appl. Genet. 1, 327-341 (1982); S. Subramani et al., Mol. Cell. Biol. 1,
854-864 (1981); R.J. Kaufmann and P.A. Sharp, "Amplification and Expression of
Sequences Cotransfected with a Modular Dihydrofolate Reductase Complementary DNA
Gene", J. Mol. Biol. 159, 601-621 (1982); R.J. Kaufmann and P.A. Sharp, Mol. Cell. Biol.
159, 601-664 (1982); S.I. Scahill et al., "Expressions and Characterisation of the Product
of a Human Immune Interferon DNA Gene in Chinese Hamster Ovary Cells", Proc. Natl.
Acad. Sci. USA 80, 4654-4659 (1983); G. Urlaub and L.A. Chasin, Proc. Natl. Acad. Sci.
USA 77, 4216-4220 (1980)).

The expression vectors useful in the present invention contain at least one expression
control sequence that is operatively linked to the DNA sequence or fragment to be
expressed. The control sequence is inserted in the vector in order to control and to regulate
the expression of the cloned DNA sequence. Examples of useful expression control
sequences are the lac system, the trp system, the tac system, the trc system, major operator
and promoter regions of phage lambda, the glycolytic promoters of yeast acid phosphatase
(for example, Pho5), the promoters of the yeast alpha-mating factors, and promoters
derived from polyoma, adenovirus, retrovirus, and simian virus (for example, the early and
late promoters of SV-40), and other sequences known to control the expression of genes of
prokaryotic and eucaryotic cells and their viruses or combinations thereof.

In the construction of a vector it is also an advantage to be able to distinguish the vector
incorporating the foreign DNA from unmodified vectors by a convenient and rapid assay.
Reporter systems useful in such assays include reporter genes, and other detectable labels
which produce measurable colour changes, antibiotic resistance and the like. In one
preferred vector, the β-galactosidase reporter gene is used, which gene is detectable by
clones exhibiting a blue phenotype on X-gal plates. This facilitates selection. In one
embodiment, the β-galactosidase gene may be replaced by a polyhedrin-encoding gene,
which gene is detectable by clones exhibiting a white phenotype when stained with X-gal.
This blue-white colour selection can serve as a useful marker for detecting recombinant
vectors.

Once selected, the vectors may be isolated from the culture using routine procedures such
as freeze-thaw extraction followed by purification.

For expression, vectors containing the DNA of the invention to be expressed and control
signals are inserted or transformed into a host or host cell. Some useful expression host
cells include well-known prokaryotic and eucaryotic cells. Some suitable prokaryotic hosts
include, for example, E. coli, such as E. coli SG-936, E. coli HB101, E. coli W3110, E.
coli X1776, E. coli X2282, E. coli DHT and E. coli MR01, Pseudomonas, Bacillus, such
as Bacillus subtilis, and Streptomyces. Suitable eucaryotic cells include yeast and other
fungi, insect, animal cells, such as COS cells and CHO cells, human cells and plant cells in
tissue culture.

Depending on the host used, transformation is performed according to standard techniques
appropriate to such cells. For prokaryotes or other cells that contain substantial cell walls,
the calcium treatment process (Cohen, S.N., Proceedings, National Academy of Science,
USA 69:2110 (1972)) may be employed. For mammalian cells without such cell walls the
calcium phosphate precipitation method of Graeme and Van Der Eb, Virology 52:546
(1978) is preferred. Transformations into plants may be carried out using Agrobacterium
tumefaciens (Shaw et al., Gene 23:315 (1983)) or into yeast according to the method of Van
Solingen et al., J. Bact. 130:946 (1977) and Hsiao et al., Proceedings, National Academy of
Science, 76:3829 (1979).

Upon transformation of the selected host with an appropriate vector, the polypeptide or
peptide encoded can be produced, often in the form of a fusion protein, by culturing the host
cells. The polypeptide, or peptide, of the invention may be detected by rapid assays as
indicated above. The polypeptide, or peptide, is then recovered and purified as necessary.
Recovery and purification can be achieved using any of those procedures known in the art,
for example by absorption onto and elution from an anion exchange resin. This method of
producing a polypeptide, or peptide, of the invention constitutes a further aspect of the
present invention.

Host cells transformed with the vectors of the invention also form a further aspect of the 
present invention. 

Methods for chemical synthesis of nucleic acids are well known and can be performed, for 
example, on commercial automated oligonucleotide synthesisers. 

The term "stringent conditions" is functionally defined with regard to the hybridisation of a
nucleic acid probe to a target nucleic acid (for example, to a particular nucleic acid
sequence of interest) by the hybridisation procedure discussed in Sambrook et al. (1989) at
9.52-9.55 and 9.56-9.58.

Regarding the amplification of a target nucleic acid sequence (for example, by PCR) using
a particular amplification primer pair, stringent conditions are conditions that permit the
primer pair to hybridise only to the target nucleic acid sequence to which a primer having
the corresponding wild type sequence (or its complement) would bind.

Nucleic acid hybridisation is affected by such conditions as salt concentration, temperature,
or organic solvents, in addition to the base composition, length of the complementary
strands, and the number of nucleotide base mismatches between the hybridising nucleic
acids, as will be readily appreciated by those skilled in the art.

When referring to a probe or primer, the term "specific for (a target sequence)" indicates
that the probe or primer hybridises under stringent conditions only to the target sequence in
a given sample comprising the target sequence.

The term "protein (or polypeptide)" refers to a protein encoded by the nucleic acid
molecule of the invention including fragments, mutations and homologs having the same
biological activity (for example, insecticidal activity). The polypeptide of the invention can
be isolated from a natural source, produced by the expression of a recombinant nucleic acid
molecule or be chemically synthesised.

Peptides having substantial sequence identity to the above-mentioned peptides can also be
employed in preferred embodiments. Here, "substantial sequence identity" means that two
peptide sequences, when optimally aligned, such as by the programs GAP or BESTFIT
using default gap weights, share at least 80% sequence identity, preferably at least 90%
sequence identity, more preferably at least 95% sequence identity or more. Preferably,
residue positions that are not identical differ by conservative amino acid substitutions. For
example, the substitution of amino acids having similar chemical properties such as charge
or polarity is not likely to affect the properties of a protein. Examples include glutamine
for asparagine, or glutamic acid for aspartic acid.
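
The identity thresholds above can be illustrated with a minimal sketch. It assumes that the two peptides have already been aligned (for example by GAP or BESTFIT) and are supplied as equal-length strings with "-" marking gaps. The function names, the toy sequences and the simplified grouping of conservative residues are illustrative assumptions only. Identity is computed over columns in which neither sequence has a gap, which is one of several conventions in use.

```python
# Illustrative sketch only: names, toy sequences and residue groups are assumptions.

# Simplified groups of residues with broadly similar charge or polarity (assumed).
CONSERVATIVE_GROUPS = [set("DE"), set("NQ"), set("KRH"), set("ILVM"), set("FYW"), set("ST")]


def _ungapped_columns(aligned_a: str, aligned_b: str):
    """Return the aligned residue pairs in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
            if a != "-" and b != "-"]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of ungapped alignment columns holding identical residues."""
    columns = _ungapped_columns(aligned_a, aligned_b)
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def is_conservative(a: str, b: str) -> bool:
    """True if both residues fall in the same (assumed) similarity group."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def mismatches_are_conservative(aligned_a: str, aligned_b: str) -> bool:
    """True if every non-identical ungapped column is a conservative substitution."""
    return all(is_conservative(a, b)
               for a, b in _ungapped_columns(aligned_a, aligned_b) if a != b)


def has_substantial_identity(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """Apply the 'at least 80%' criterion (90 or 95 can be passed as threshold)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "MKTAYIAKQR"  # hypothetical peptide
    variant = "MKTAYLAKNR"    # I->L and Q->N substitutions
    print(percent_identity(reference, variant))
    print(has_substantial_identity(reference, variant))
    print(mismatches_are_conservative(reference, variant))
```

In this toy case the two differences (isoleucine for leucine, glutamine for asparagine) are both of the conservative kind described above.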

Brief Description of Drawings

The invention will be further defined by reference to the specification and the following 
examples and figures herein. 

Figure 1 shows restriction maps of clones used to isolate the pathogenic region and
maps of the two pathogenic variants pMH32 and pMH41, in accordance
with a preferred embodiment of the present invention; and

Figure 2 shows deletion derivatives used in the study, restriction maps of the mutated
constructs and recombinants, the phenotype of each mutation, the schematic
diagram of the sequenced region, and a nucleotide sequence in accordance
with a preferred embodiment of the present invention; and

Figure 3 shows hydrophobicity plots of SepC and its closest homologue TccC, in
accordance with a preferred embodiment of the present invention; and

Figure 4 shows the comparison of protein sequences of the SepA and P. luminescens
toxins, TcdA, TcaB and TccB. The putative RGD motif is boxed, and the site of
proteolytic cleavage is illustrated, in accordance with a preferred
embodiment of the present invention; and

Figure 5 shows the comparison of protein sequences of the SepC and P. luminescens
toxin TccC, in accordance with a preferred embodiment of the present
invention; and

Figure 6 shows the plasmid pADAP, in accordance with a preferred embodiment of
the present invention.

BEST MODES FOR CARRYING OUT THE INVENTION 

The invention will be further defined by reference to the specification and the following 
examples and figures herein in the ensuing description by way of example only where: 

Figure 1 shows restriction maps of clones used to isolate the pathogenic region and maps of
the two pathogenic variants pMH32 and pMH41, where:

(A) Is the pADAP HindIII clone pGLA-20 showing locations of the pGLA-20 mutations
-10, -13 and -35, which when recombined back into pADAP and bioassayed against grass
grub, result in either a pathogenic phenotype, shown by full flag, or a healthy but non-
feeding phenotype indicated by half filled flag. Map of pBG35 showing relative position of
pGLA-20-35 mutation and the location of the 2.2kb EcoRI fragment used as a probe to screen the
pADAP BamHI library; and

(B) Illustrated restriction enzyme maps of the pathogenic clones pMH32 and pMH41, area
of deletion is indicated by Δ.

[symbol] pBR322 vector DNA;

[symbol] pLAFR3 vector DNA.

Restriction enzymes are abbreviated as follows: B, BamHI; Bg, BglII; E, EcoRI; H, HindIII;
and X, XbaI.
Figure 2 shows: 

(A) Which are mini-Tn10 pACYC184 based deletion derivatives used in the study.
[symbol] is the pACYC184 vector,

Δ indicates deletion, + pathogenic,
- loss of pathogenicity; and

(B) Illustrates restriction maps of the mutated constructs pBM32 and the pADK
recombinants; and

(C) Where the phenotype of each mutant is indicated by flags.

Blocked flags indicate mutations that did not affect the disease process.

Open flags indicate mutations that abolish disease symptoms.

Half-filled flags denote mutations that abolish visual disease symptoms but leave
larvae unable to feed.

* indicates pADK mutations obtained by Grkovic et al. (1995).

Restriction enzymes are abbreviated as follows: B, BamHI; Bg, BglII; E, EcoRI; H,
HindIII; and X, XbaI.

(D) Is a schematic diagram of the sequenced region, where:

[symbol] denotes sequenced region.

Arrows indicate ORFs and their direction.

[symbol] region homologous to spvB; [symbol] location of repeat.

(E) Is a nucleotide sequence of the 5 times 12bp repeat and the palindrome.

Restriction enzymes are abbreviated as follows: B, BamHI; Bg, BglII; E, EcoRI; H, HindIII;
and X, XbaI.

In Figure 3 hydrophobicity plots of SepC and its closest homologue TccC are shown. The
scale is disproportional to size and has a scanning window of 17 amino-acid residues.
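
A scanning-window plot of this kind can be sketched as follows. The sketch assumes the Kyte and Doolittle hydropathy scale, which the description names for the hydropathicity profiles of the Sep proteins, and the 17-residue window stated above. The function name and the convention of averaging the scores within each window are illustrative assumptions, and the plotting step itself is omitted.

```python
# Illustrative sketch only: function name and averaging convention are assumptions.

# Kyte and Doolittle (1982) hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 17) -> list:
    """Mean hydropathy of each full window as it slides along the protein.

    Returns len(protein) - window + 1 values; positive values are hydrophobic,
    negative values hydrophilic. Raises KeyError for non-standard residues.
    """
    sequence = protein.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [sum(scores[start:start + window]) / window
            for start in range(len(sequence) - window + 1)]


if __name__ == "__main__":
    # A short, strongly charged stretch scored with a window of its own length.
    print(hydropathy_profile("RRRRE", window=5))
```

A run of arginine residues gives a strongly negative (hydrophilic) value, which is the kind of feature that appears as a steep hydrophilic peak in such a plot.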

Figure 4 shows the comparison of protein sequences of the SepA and P. luminescens
toxins, TcdA, TcaB and TccB. Putative RGD motif is boxed. The site of proteolytic
cleavage reported by Bowen et al. (1998) (residue 1933 of TcdA) is indicated by an
arrow.

Figure 5 shows the comparison of protein sequences of the SepC and P. luminescens toxin
TccC; and Figure 6 shows the plasmid pADAP.

Bacterial isolates and methods of culture

Table 1 lists bacterial isolates and plasmids used in the present invention. Bacteria were
grown in LB broth or on LB agar (Sambrook et al. 1989), at 37°C for Escherichia coli and
30°C for S. entomophila. Antibiotic concentrations used (µg/ml) for Serratia were
kanamycin 100, chloramphenicol 90, tetracycline 30 and for E. coli strains were kanamycin
50, chloramphenicol 30, tetracycline 15, and ampicillin 100.

DNA isolation and manipulations 

pADAP DNA was isolated from a 50ml overnight culture of bacteria using a QIAGEN®
plasmid maxi kit (Qiagen, Hilden, Germany), as per the manufacturer's instructions.
Standard DNA techniques were carried out as described by Sambrook et al. (1989).
Radioactive probes were made using the Amersham Megaprime DNA labeling system
(Amersham, Buckinghamshire, UK). Southern and colony hybridisations were performed
as outlined in Sambrook et al. (1989). The plasmid pADAP is shown in Figure 6.

The pADAP BamHI library was constructed using a Gigapack® III XL packaging extract,
as specified by the manufacturer (Stratagene, California, USA).

Introduction of plasmid DNA into E. coli and S. entomophila

pLAFR3 based derivatives were introduced into S. entomophila by tripartite matings on
solid media as described previously (Finnegan & Sheratt, 1982) using the pRK2013 helper
plasmid (Figurski & Helinski, 1979). pACYC184 and pBR322 based plasmids were
electroporated into E. coli and S. entomophila strains, using a Biorad Gene Pulser (2 µF,
2.5 kV, and 200 ohms) (Dower et al. 1988).

Mutagenesis

Transposon insertions were generated in recombinant plasmids using the mini-Tn10
derivative 103 (kanamycin resistant) as described by Kleckner et al. (1991). Insertions
were recombined into pADAP by transforming A1M02 (refer to Table 1) with the
described construct. After growth in non-selective media, bacteria were screened for
resistance to kanamycin and loss of the pLAFR3 tetracycline resistance marker.

Bioassay against Costelytra zealandica larvae 

Infection of C. zealandica larvae was determined by a standard bioassay where the healthy
larvae, collected from the field, were individually fed squares of carrot which had been
rolled in colonies of bacteria grown overnight on solid media (resulting in approximately
10 s cells/carrot square). Twelve second or third instar larvae were used for each treatment.
Inoculated larvae were maintained at 15°C, in ice-cube trays. Larvae were left feeding on
treated carrot for 3-4 days, then transferred to fresh trays and provided with untreated carrot
for 10-14 days. The occurrence of gut clearance and loss of feeding was recorded every 3-4
days. Strains were considered disease-causing if greater than 70% of larvae showed disease
symptoms by day 14. Known pathogenic and non pathogenic controls were included in all
bioassays. Typically cessation of feeding occurs within 2-3 days while clearance of the
larval gut may take 4-6 days.
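
The scoring rule stated above (greater than 70% of larvae showing disease symptoms by day 14, with twelve larvae per treatment) can be written as a short sketch. The function names and the example counts are hypothetical and serve only to show how the threshold is applied.

```python
# Illustrative sketch only: function names and example counts are hypothetical.

LARVAE_PER_TREATMENT = 12  # twelve larvae per treatment, as stated in the bioassay
DISEASE_THRESHOLD = 0.70   # "greater than 70%" of larvae symptomatic by day 14


def fraction_symptomatic(symptomatic: int, total: int = LARVAE_PER_TREATMENT) -> float:
    """Fraction of larvae showing disease symptoms by day 14."""
    if total <= 0 or not 0 <= symptomatic <= total:
        raise ValueError("counts must satisfy 0 <= symptomatic <= total and total > 0")
    return symptomatic / total


def is_disease_causing(symptomatic: int, total: int = LARVAE_PER_TREATMENT,
                       threshold: float = DISEASE_THRESHOLD) -> bool:
    """True when strictly more than the threshold fraction of larvae are symptomatic."""
    return fraction_symptomatic(symptomatic, total) > threshold


if __name__ == "__main__":
    for count in (8, 9):
        print(count, is_disease_causing(count))
```

With twelve larvae, nine symptomatic larvae (75%) meet the criterion whereas eight (about 67%) do not.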

Recovery of bacteria from larvae 

To isolate bacteria from inoculated grubs, larvae were surface sterilised by submerging in
70% methanol for 30 seconds. The larvae were then shaken in sterile dH2O, removed and
individually macerated in a 1.5ml microcentrifuge tube. The macerate was serially diluted
and plated on LB media containing antibiotics selective for the host S. entomophila strain.
To assess the stability of the bioassayed plasmid, colonies were patched onto a plate
containing antibiotics either selective for the recombinant plasmid or the S. entomophila
strain. Identity of plasmids in the recovered strain was checked by restriction enzyme
profile.

Nucleotide Sequencing 


A 9-kb BamHI-EcoRI fragment derived from the pBM32-8 mutation (Fig 2b) and the 8kb
HindIII fragment of pBM32 were separately cloned into the appropriate site of the deletion
factory plasmid pDELTA1. Deletions were generated using the Deletion Factory™ system
(GIBCO BRL, MD, USA), as outlined in the manufacturer's instructions. To identify the
precise location of mini-Tn10 mutations, the peripheral mini-Tn10 BamHI sites were used
in conjunction with the BamHI sites of the pathogenic region to subclone the mini-Tn10
flanking regions into either pACYC184 or pUC19. Sequences were generated using the
mini-Tn10 specific primer 5'ATGACAAGATGTGTATCCACC3' (Kleckner et al. 1991).
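
As a rough illustration of how a sequencing primer such as the one above might be characterised, the following sketch computes its length, GC fraction, reverse complement and an approximate melting temperature. The primer string is the sequence given above with spacing removed. The Wallace rule (2°C per A/T, 4°C per G/C) is an assumed rule of thumb for short oligonucleotides and is not stated in this specification; the function names are illustrative.

```python
# Illustrative sketch only: function names and the Wallace-rule estimate are assumptions.

MINI_TN10_PRIMER = "ATGACAAGATGTGTATCCACC"  # primer sequence from the text, written 5' to 3'

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(oligo: str) -> float:
    """Fraction of bases that are G or C."""
    sequence = oligo.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def wallace_tm(oligo: str) -> int:
    """Approximate melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    sequence = oligo.upper()
    at = sequence.count("A") + sequence.count("T")
    gc = sequence.count("G") + sequence.count("C")
    return 2 * at + 4 * gc


def reverse_complement(oligo: str) -> str:
    """Sequence of the complementary strand, written 5' to 3'."""
    return oligo.upper().translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(len(MINI_TN10_PRIMER))
    print(round(gc_fraction(MINI_TN10_PRIMER), 3))
    print(wallace_tm(MINI_TN10_PRIMER))
    print(reverse_complement(MINI_TN10_PRIMER))
```

The reverse complement is the sequence that would be read on the opposite strand at the primer-binding site, which can help when locating the insertion junction in sequence data.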

Plasmids for sequencing were prepared by Wizard® (Promega, Madison, USA) or Quantum
Prep® (Bio-Rad, California, USA) miniprep kits. Sequences were determined on both
strands, by using combinations of subcloned fragments, custom primers and deletion
products derived from the deletion factory system (Gibco BRL, Madison, USA). The DNA
was sequenced using either 33P dCTP and the Thermosequenase cycle sequencing kit
(Amersham, Buckinghamshire, UK), or by automated sequencing using an Applied
Biosystems 373A or 377 autosequencer. Sequence data were assembled using SEQMAN
(DNASTAR Inc., Madison, USA). ORFs were analysed by Gene Jockey. Databases at the
National Center for Biotechnology Information were searched by using BLASTN and
BLASTX via www.ncbi.nlm.gov/BLAST. Searches for DNA palindromes, repeats and
inverted repeats were undertaken using DNAMAN (Lynnon Biosoft, Quebec, Canada).
Protein motifs were searched using Blocks (http://www.blocks.fhcrc.org/), ExPASy
(http://www.expasy.ch/), and Gene Quiz (http://columba.ebi.ac.uk:8765/gqsrv/submit).

The sequences determined in this study have been deposited in GenBank under Accession
Number AF135182.



Results 

Cloning the disease encoding region from pADAP 

Previously, Grkovic et al. (1995) have shown that the pADK-13 mutation can be
complemented with the pADAP 11 kb HindIII fragment (pGLA-20). However, the pADK-10
mutation was unable to be complemented with pGLA-20. In an attempt to isolate the
region that may complement the pADK-10 mutation the previously described pGLA-20
derived, pADK-35 null mutation (Grkovic et al. 1995) was used as a selective marker (Fig
1), to select the BglII fragment encompassing both the pADK-10 and pADK-35 mutations.
pADK-35 DNA was isolated and digested with the restriction enzyme BglII. The resultant
digest was ligated into the BamHI site of pBR322 to form the construct pBG35 (containing
12.8kb BglII - mini-Tn10 fragment). pBG35 was placed separately in trans with pADK-10
and pGLA-20, and the resultant strains bioassayed against grass grub larvae. Results
showed that pBG35 was able to complement the pADK-10 mutant, but was unable to
induce any symptoms of amber disease when placed in trans with pGLA-20, indicating that
there must be another region on pADAP needed to induce amber disease.

Restriction enzyme data of pGLA-20 and pBG35 suggested that the entire pathogenic
region may reside within one of the large BamHI fragments of pADAP. A cosmid BamHI
library of pADAP was made and screened using the 2.2kb EcoRI fragment derived from
pBG35 (Fig 1) as the probe. Several probe positive clones were isolated; all shared similar
restriction enzyme profiles. However, one (designated pMH32) was found to be smaller,
measuring only 23kb in size compared with the 33kb of the other clones (e.g. pMH41; Fig
1b). The difference between pMH32 and pMH41 was found to be a 10kb deletion at the
left most end of pMH32 encompassing the one HindIII site (Fig 1). E. coli strains
containing pMH32 or pMH41 were bioassayed against grass grub larvae and found to induce
the full symptoms of amber disease (that is, gut clearance and antifeeding activity).
However, about ten days after infection a proportion of grass grubs fed the E. coli strains
were found to recover from a diseased to a healthy phenotype.

The plasmids pMH32 and pMH41 were subsequently introduced into a S. entomophila
strain cured of pADAP (5.6RC) and the strains bioassayed against grass grub larvae. The
strains gave the same disease progression as wild type and no larvae recovered, suggesting
that the region cloned in pMH32 contained all the pathogenic determinants of pADAP.

Effect of copy number and mini-Tn10 insertions in pBM32 on disease-causing ability

To facilitate mutagenesis and assess the effect of copy number on the disease process, the
23kb BamHI fragment from pMH32 was cloned into the medium copy plasmid pBR322 to
give pBM32. A bioassay comparing the ability of pMH32 and pBM32 to induce amber
disease against grass grub was undertaken. Results showed that there were no visual
differences in the progression of amber disease between pBM32 and pMH32. The
construct pBM32 was mutated with the mini-Tn10 transposon derivative 103, and
insertions mapped (Fig 2b). Bioassays of E. coli strains containing plasmids of the
resultant mutants showed that the disease determinants were confined within a central
16.9kb region (nucleotides 1955-18937 of SEQ ID NO: 1).

All strains were non-pathogenic or fully pathogenic, and no partial disease phenotypes such
as antifeeding, or gut clearance were noted.

To confirm that no sequences at either end of the cloned fragment influenced the disease
process, several deletion plasmids were made (Fig 2a). The large fragments resulting from
cleavage of the pBM32 -4, -8, -10, -20, -23, -24 and -35 plasmids with BamHI were cloned
into the analogous site of pACYC184. The resultant plasmids were transformed into the
non-pathogenic S. entomophila strain 5.6RKm and assessed for pathogenicity. This
analysis confirmed that the central 16.9kb region (Fig 2a) was sufficient to induce the
disease.

Effect of mini-Tn10 insertions in pADAP on disease-causing ability

Grkovic et al. (1995) recombined by marker exchange the pGLA-20 based mutations -10
and -13 into pADAP (Fig 2a). When bioassayed, S. entomophila strains containing either
of these mutant plasmids caused a partial condition including cessation of feeding but not
gut clearance or amber colouration. This was in contrast to the complete abolition of
disease observed in pADAP-cured S. entomophila strains containing mutant pBM32
plasmids with similar insertions.

To determine the disease phenotype of the pBM32-based insertions in a pADAP
background, the pBM32 based insertions were transferred into pADAP. pBM32 -1, -2, -4,
-5, -6, -8, -9, -10, -21, -24, -30, -31 and -35 DNA fragments containing the inserted
transposon and flanking DNA were cloned as independent fragments into pLAFR3 and the
inserts recombined back into pADAP by marker exchange (Fig 2c). The resultant
recombinant S. entomophila strains were checked by Southern analysis to confirm that
recombination had occurred as expected and no pLAFR3 vector sequences were present
(data not shown). Mutations that did not affect the disease process in pBM32 also had no
effect when recombined back into pADAP. However, strains with the pADAP mutants that
totally abolished the disease process when in the pBM32 clone caused non-feeding but not
gut clearance of the grubs (Fig 2b, c). Hence, none of the pADAP recombinant strains
completely abolished the disease process. This suggests that, while the 16.9kb fragment
contains all genes required for pathogenicity, other genes contributing to the antifeeding
effect are present on some other part of pADAP.

Assessment of plasmid stability during the course of the bioassay showed that greater than
90% of the recombinant Serratia strains contained the clone of interest.

Nucleotide Sequence Analysis of the pathogenic region 

The large BamHI fragment (18937 bp) derived from the pBM32-8 was sequenced on both
strands using a combination of constructed deletions, plasmid subclones and custom made
primers. A total continuous sequence of 18937 bp has been deposited in GenBank
(Accession Number AF135182). Structural analysis of the DNA sequence using
DNAMAN showed that there was a 12-bp sequence repeated five times between positions
683 and 743. The repeat is flanked by an upstream 13 base pair palindrome (669-682-bp),
and a degenerate 34-bp downstream palindrome (765-799-bp) (Fig 2d,e).
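
The kind of structural feature described above, a 12-bp unit repeated five times in tandem (60 bp in total, consistent with the interval between positions 683 and 743), can be located with a simple scan. The following is a minimal sketch and not the DNAMAN algorithm: it finds only exact, perfectly tandem copies, and the function name and the toy sequence are hypothetical.

```python
# Illustrative sketch only: exact tandem repeats; not the DNAMAN algorithm.


def find_tandem_repeats(sequence: str, unit_length: int = 12, min_copies: int = 5) -> list:
    """Return (start, unit, copies) for each run of exact tandem repeats.

    start is 1-based, matching the position numbering used for sequence features.
    """
    seq = sequence.upper()
    hits = []
    i = 0
    while i + unit_length <= len(seq):
        unit = seq[i:i + unit_length]
        copies = 1
        j = i + unit_length
        # Count how many further adjacent copies of the same unit follow.
        while seq[j:j + unit_length] == unit:
            copies += 1
            j += unit_length
        if copies >= min_copies:
            hits.append((i + 1, unit, copies))
            i = j  # continue scanning after the repeat array
        else:
            i += 1
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence: 3 leading bases, five copies of a 12-bp unit, 2 trailing bases.
    toy = "TTT" + "ACGTACGTAACC" * 5 + "GG"
    print(find_tandem_repeats(toy))
```

Degenerate repeats or palindromes, such as the flanking palindromes noted above, would require a scan that tolerates mismatches and are outside the scope of this sketch.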

Translation of the nucleotide sequence revealed nine significant open reading frames
(ORFs). These together with their putative ribosomal binding sites and their base
composition are listed in Table 2. Eight of the ORFs were oriented in the same direction
and the other two in the opposite direction (Fig 2d). Sequence similarity searches showed
that the deduced products of seven of these ORFs shared similarity with known proteins
(Table 3). Products of three of the ORFs showed similarity to different protein
components of insecticidal toxins of Photorhabdus luminescens (Bowen et al. 1998).

These ORFs have been designated sep (sepA, sepB and sepC) for Serratia entomophila
pathogenicity.

Similarities of deduced amino-acid sequences to proteins in current database 

Results of database searches for homologous proteins are listed in Table 4.

With reference to Fig 2d and Table 4, the following protein similarities were identified: 

The protein product of sepA had high similarity to the P. luminescens insecticidal toxin
complex proteins TcbA, TcdA, TcaB and TccB. These proteins shared three significant
regions of predicted amino-acid similarity: at the amino-terminal region (SepA amino-acid
residues 121-178), a central region (SepA amino-acid residues 960-1083) and, with
greatest similarity, at the carboxyl terminus (SepA amino-acid residues 1630-2376) (Fig. 4).
However, there was little amino acid conservation around the putative proteolytic cleavage
site of TcaB, TcbA and TcdA identified by Bowen et al. (1998). SepA also contained a
region (residues 1057-1345) with weak similarity to the Clostridium bifermentans
mosquitocidal toxin cbm71 (Barloy et al., 1996).

SepB and the P. luminescens insecticidal toxin complex protein TcaC shared similarity
throughout their length, and both SepB and TcaC showed high amino-terminal similarity to
the Salmonella virulence protein SpvB (Gullig et al. 1992) (Fig. 5). The similarity of SepB
and TcaC to SpvB diminishes after SpvB amino acid residue 356.

SepC showed strong similarity to the amino-terminal of the insecticidal toxin complex
protein TccC, up to amino-acid residue 663 of SepC. A number of putative bacterial cell
wall proteins also have high similarity to SepC, including the wall-associated protein
precursor of B. subtilis (WAPA) and members of the E. coli Rhs (recombinant hot spot)
elements. Strong similarity of SepC was also observed with hypothetical wall-associated
proteins from Coxiella burnetii and Bacillus subtilis (Table 4).

The translated sequences of ORF1 and ORF2 showed no similarity to sequences in the
current databases. ORF3 shared significant similarity to the morphogenesis protein of the
Bacillus subtilis bacteriophage B103, a member of bacteriophage muramidase-type lysis
proteins (Pecenkova et al. 1996). However, relative to size, the gp19 protein of S.
typhimurium phage ES18 (146 amino-acid residues) or the nucD/regB phage lysozymes of
S. marcescens (179 amino-acid residues) are more similar. ORF4 showed similarity to E.
coli bacteriophage N15 gp55 protein, a protein of unknown function (Zimmer et al. 1998).

Located in the same orientation as the sep genes and 134bp downstream of the SepC
termination codon is a 204 base pair region assigned ORF5, which has high similarity to a
S. typhimurium resolvase/invertase protein. However, ORF5 is disrupted by two stop
codons at amino-acid residues 19 and 64, making it unlikely that an active
resolvase/invertase protein is encoded by this region. A 256-bp region encompassed by
ORF5 (17498-17754) showed high similarity (77% identity) to the region (AF020806;
1629-1885 bp) encoding S. typhimurium DNA invertase gene (Valdivia et al. 1997),
suggesting a similar ancestral origin.

Downstream of ORF5 and oriented in the opposite direction from 18935-18163 was an 870
base pair region of DNA designated ORF6 whose product showed high amino-acid
similarity over two different reading frames to the insertion element IS91 of E. coli
(Mendiola et al. 1992). The translated sequence is interrupted at amino-acid residue 149 of
the IS91 element and later resumed on a second reading frame, before its similarity
switched back to the original reading frame. Switching of ORFs is a common feature of
members of the IS3 family where the transposase is encoded by overlapping ORFs
(Prere et al. 1990). However, the switch back to the initial strand is atypical. ORF6 may
therefore be a dysfunctional relic of an ancestral IS element. It is unknown whether ORF6
contains a ribosomal binding site as its predicted location would lie outside the sequenced
region. There was no DNA similarity to the IS91 element.

Analysis for protein motifs showed that a tripeptide cell-binding motif Arg-Gly-Asp
(RGD), implicated in the binding of various adhesion proteins produced by parasites and
viruses to eukaryotic cells (Leininger et al. 1991), is present in SepA and the P.
luminescens TcdA and TcaB proteins (Fig. 4). The RGD motif is present in cell surface
adhesins produced by the human pathogen Bordetella pertussis, namely the filamentous
haemagglutinin (220 kDa) (Relman et al. 1989) and the outer membrane protein pertactin
(69 kDa) (Leininger et al. 1991). These motifs have been implicated in enhancing the
binding of B. pertussis to eukaryotic cells. Because the RGD motif found in SepA falls in a
region of high similarity between SepA and its P. luminescens counterparts, it may play a
role in mediating the attachment of the protein and/or the bacteria to the insect cell wall.
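
Locating a short motif such as RGD in a protein sequence is a simple string search. The sketch below is illustrative only: the function name and the toy peptide are hypothetical, and it does not reproduce the motif-analysis software used in the study.

```python
def find_motif(protein, motif="RGD"):
    """Return 1-based start positions of every occurrence of motif."""
    protein = protein.upper()
    positions = []
    start = protein.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to residue number
        start = protein.find(motif, start + 1)
    return positions


# Hypothetical peptide containing two RGD tripeptides.
toy_peptide = "MKTRGDLLSARGDE"
print(find_motif(toy_peptide))
```

Comparing the reported positions across an alignment of homologues is one way to see whether a motif falls in a conserved region, as argued for SepA above.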

The hydropathicity profile of each of the Sep proteins was examined using the Kyte and
Doolittle algorithm (Kyte and Doolittle, 1982) and compared to the relevant P. luminescens
homologues. None of the Sep proteins contained a positively charged amino terminus
followed by a hydrophobic region, characteristic of a signal sequence (Gierasch, 1989).
The profiles of SepA, TcbA and TcdA were very similar (data not shown) and each
exhibited a steep hydrophilic peak at the carboxyl terminus (residues 2055-2061 of SepA),
specifically the protein sequence RRRRE (Fig. 4). Although both SepB and TcaC shared
similarity to the Salmonella virulence protein SpvB, the amino-termini of SepB and TcaC
were hydrophilic as opposed to the hydrophobic nature of SpvB. The profiles of SepC and
its Photorhabdus counterpart TccC differed in that SepC had a slightly hydrophilic
amino-terminus, whereas TccC lacked a hydrophilic amino-terminus and had a significantly
hydrophobic carboxyl terminus from amino-acid residue 717 onwards (Fig. 3).
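
A Kyte and Doolittle hydropathy profile is a sliding-window average of per-residue hydropathy values. The sketch below uses the published Kyte-Doolittle scale; the function name and the default window length are assumptions, since the window used in the study is not stated.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=9):
    """Return the mean hydropathy of each window along the sequence."""
    scores = [KD_SCALE[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# The RRRRE stretch cited for the SepA carboxyl terminus, as one window.
print(hydropathy_profile("RRRRE", window=5))
```

Strongly negative window averages correspond to hydrophilic peaks such as the RRRRE stretch, while sustained positive values would indicate a hydrophobic region of the kind expected in a signal sequence.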

Analysis to detect repetitive motifs characteristic of the RTX family of toxins (Welch,
1991) using DOTPLOT showed that only P. luminescens TccC contained a plot characteristic of
a repeat motif, present at the carboxyl terminus (data not shown).
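
The idea behind a dot plot can be illustrated with a self-comparison: identical words shared between two positions of the same sequence appear as off-diagonal dots, and runs of such dots indicate repeats. This is a simplified sketch, not the DOTPLOT program used in the study; the function names, word length and toy sequence are hypothetical.

```python
def dotplot_matches(seq_a, seq_b, word=3):
    """Return 0-based (i, j) pairs where a word of the given length is identical."""
    index = {}
    for j in range(len(seq_b) - word + 1):
        index.setdefault(seq_b[j:j + word], []).append(j)
    return [
        (i, j)
        for i in range(len(seq_a) - word + 1)
        for j in index.get(seq_a[i:i + word], [])
    ]


def self_repeats(seq, word=3):
    """Off-diagonal matches of a sequence against itself (upper triangle only)."""
    return [(i, j) for i, j in dotplot_matches(seq, seq, word) if i < j]


# Hypothetical peptide made of a five-residue unit repeated twice.
print(self_repeats("ACDEFACDEF"))
```

Consecutive pairs with a constant offset between i and j form a diagonal line in the plot, which is the visual signature of a repeat motif.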

Analysis of DNA composition (%GC) and similarity 

Comparisons of the GC content (Table 3) showed that the sepA and sepB genes were more
GC-rich than their P. luminescens counterparts, while sepC and tccC had similar GC
content. The high GC content of sepC may be attributed to the close relationship of these
protein products to the Rhs family of wall-associated proteins, which have a GC-rich core of
62% (Wang et al. 1998). Comparison of the GC content of the sep genes with that of the
S. entomophila genome shows that they are rather similar, suggesting that the sep genes
were not recently acquired by S. entomophila.

Identification of mini-Tn10 location by sequence analysis

Analysis of the insertion points of the previously isolated mini-Tn10 insertions (Fig. 2)
within the putative ORFs (Table 4) revealed that ORF3 and ORF4 were interrupted by the
-9, -23, -24 (ORF3) and -35 (ORF4) mutations. These insertions had no effect on the
pathogenicity process, suggesting that ORF3 and ORF4 do not play a significant role in
pathogenicity. However, the pADAP-35 mutation was at the 3' end of ORF4, resulting in
the truncation of the final 11 amino-acid residues of ORF4 (Fig. 4), which may not have
affected protein function. Further mutagenesis of ORF4 is therefore required to confirm
that it has no role in pathogenicity. The mutations that caused loss of pathogenicity all
resided within SepA, SepB or SepC. No mutation mapped to ORF1, ORF2 or ORF5.

Complementation analysis of the sep proteins 

Following sequence analysis, each of the sep ORFs was excised as closely as possible with
restriction enzymes, placed into pLAFR3 and placed in trans with the appropriate pADAP
mutation. Complementation of SepA was undertaken through the use of the 8.5 kb HindIII
clone (pMH45), which encompasses both ORF1 and SepA. SepB was excised as a 5.4 kb
StuI fragment and SepC was excised as a 4.6 kb fragment using one of the peripheral
BamHI sites from the pBM32-13 mutation and the StuI site of pBM32 (Fig. 2b).

Complementation analysis showed that pLAFR3-based SepB and SepC are able to
complement their mutated pADK counterparts. Grkovic et al. (1995) had
previously shown that SepC could complement itself. However, this was achieved
using the entire 11 kb HindIII fragment of pGLA-20.

Whether SepA is able to complement itself has yet to be fully established. It was found that
~98% of the pMH45 construct was lost during the course of the bioassay. This result
was sporadic, and occasionally a repeated experiment would show the presence of diseased
grubs. Analysis of the macerates of these grubs showed that pMH45 was present, indicating
that pMH45 can possibly complement SepA. However, before further complementation
analysis of SepA can be undertaken, measures to ensure the stability of the complementation
plasmid are needed.

Discussion

The large conjugative plasmid, pADAP, of S. entomophila carries the genes responsible
for cessation of feeding and gut clearance, characteristics of amber disease in the New
Zealand grass grub C. zealandica. This plasmid is present in all S. entomophila and S.
proteamaculans strains capable of causing amber disease (Glare et al. 1993) and had been
implicated in disease processes (Grkovic et al. 1995). The applicant has defined a 16.9 kb
region of pADAP that is sufficient to confer pathogenicity towards C. zealandica on
pADAP-cured strains of S. entomophila and on strains of E. coli. Hence, the region
contains all the essential pathogenicity genes of S. entomophila responsible for amber
disease. Nucleotide sequence and mutagenesis analysis of the region revealed three genes,
SepA, SepB and SepC, that together are sufficient for pathogenicity. Mutations in any of
the three genes completely abolished the disease process and partial disease states were not
detected, suggesting that the three genes may interact to exert an effect.

The 23-kb region cloned into pBR322 to make pBM32 conferred pathogenicity on
pADAP-cured S. entomophila strains, with all symptoms of amber disease being observed.
Insertion mutants in pBM32 that abolished pathogenicity were transferred to pADAP. The
resultant strains showed a partial disease phenotype, including anti-feeding but not gut
clearance, suggesting that an additional anti-feeding gene may be present elsewhere on
pADAP. The occurrence of two different anti-feeding genes on pADAP is also consistent
with the data of Grkovic et al. (1995), who found that suppression of feeding was stronger
in the wild-type pADK-6 strain than in the partial disease state (pADK-10, pADK-13) of
inducing anti-feeding but no gut clearance. A putative anti-feeding gene, amb2, has already
been isolated from the genomic DNA of S. entomophila (Nunez-Valdez and Mahanty,
1996). Recent data indicate that the amb2 locus resides at an as yet unidentified location
on pADAP that is remote from the region identified herein (Hurst, unpublished data).

Sequence analysis and comparison of the products of the sep genes showed that they share
significant similarity to the proteins TcbA (TcdA, TcaB, TccB), TcaC and TccC that
comprise the toxin complexes of P. luminescens. Like the P. luminescens genes, the sep
genes of Serratia share a similar organisational pattern of three genes ordered in succession
in the same orientation, and opposed by a terminal gene transcribed in the opposite
direction. However, the order of the sep genes differs, the genes are slightly smaller in
size, and they comprise constituents of each of the P. luminescens loci, including tca
(tcaB=sepA, tcaC=sepB). The P. luminescens toxin gene tcd (Ensign et al. 1997) is also
similar to sepA. The similarity shared between
the sep and tc gene products suggests that they are members of a new family of insecticidal
toxins. The lack of DNA similarity, as opposed to protein similarity, between the sep and P.
luminescens tc genes, together with the difference in GC content of the sepA and sepB genes
compared to the tc genes, suggests that these genes were present in the common
enterobacterial ancestor of P. luminescens and S. entomophila and were not acquired by a
more recent horizontal transfer event.

The Photorhabdus toxins were isolated as a composite of proteins which are hypothesised
to interact synergistically to form a toxin complex. The toxins are also able to exert an
anti-feeding effect (Bowen et al. 1998; Bowen and Ensign, 1998). This is consistent with
the results we obtained with the sep mutants. pADAP-cured S. entomophila strains
containing the pathogenicity clone pBM32 exert an anti-feeding effect on the grass grub,
and individual mutations within any of the sep genes have an identical phenotype,
completely abolishing pathogenicity. The Photorhabdus toxins have a wide host range,
affecting Lepidoptera, Coleoptera and Dictyoptera, and undergo post-translational
proteolytic processing (Bowen et al. 1998). No similarities of Sep proteins were found to
the Photorhabdus toxin component TccA, and only the amino-terminus of TcaA shared
similarity to SepA. This, and the difference in the hydrophobicity profiles of SepC and
TccC, may account for the specificity of the Sep proteins towards C. zealandica. However, the
Sep proteins have yet to be purified and it is unknown whether the sep genes are expressed
when S. entomophila is ingested by other insects. Therefore the possibility that these
newly-described toxins may exhibit a broader host range cannot be ruled out.

The Photorhabdus toxin TcbA shares weak similarity to the Clostridium difficile A and B
toxins (Bowen, 1998), but no such similarities were found to SepA. The C. difficile A and B
toxins belong to the RTX (repeats in toxin) family of toxins, which are noted for the
presence of several carboxyl-terminal repeats (von Eichel-Streiber et al. 1992). A search of
the Sep proteins and their P. luminescens homologues for protein repeats showed that only
the P. luminescens TcaC protein contained a repeat-type signature. The TcaC
carboxy-terminal repeat bears little resemblance in size or number to the repeats found in
RTX toxins (von Eichel-Streiber et al. 1992). SepA does show weak similarity to the mosquitocidal
toxin Cbm71 of C. bifermentans (Barloy et al. 1996). However, when this region is
compared with the relevant Photorhabdus homologues, it is a region with little similarity.

SepB has strong similarities to both P. luminescens TcaC and the Salmonella virulence
gene product SpvB (Gulig et al. 1992). SpvB is believed to enhance the survival of virulent
Salmonella in macrophages (Libby et al. 1997). It has been suggested that TcaC may act by
attacking insect haemocytes (Bowen et al. 1998). However, haemocytes reside within the
insect haemocoel and S. entomophila does not invade the haemocoel until late in the
infection process (Jackson et al. 1993), suggesting that SepB may act in some other way.
The similarity of SepB and TcaC to SpvB is high but diminishes ten amino-acid residues
upstream of the proline-rich region found in SpvB that is postulated to divide the protein
into separate domains (Roudier et al. 1992). This may indicate a vital role for the
amino-terminus of both SepB and SpvB in interacting with an evolutionarily-conserved
eukaryotic protein.

WO 01/16305 PCT/NZ00/00174

The SepC protein shows high similarity to a family of cell wall-associated bacterial proteins
such as the B. subtilis wall-associated protein (WapA) and members of the E. coli Rhs
element family. The function of the Rhs proteins has yet to be established, but they are
believed to be cell surface ligand-binding proteins (Hill et al. 1994). The Rhs proteins and
the B. subtilis wall-associated protein contain a characteristic repetitive peptide motif, but
no such motif was observed in SepC. A feature of rhs elements is the presence of a
downstream IS element (Wang et al. 1998). A degenerate IS91-type transposase element
(ORF6) is present downstream of SepC. The IS91 element has been found associated with
plasmids or chromosomal genes involved in α-haemolysin synthesis, and has been
postulated to play a pivotal role in the spread of the α-haemolysin genes by means of
IS91-mediated recombinational activity (Zabala et al. 1984). It seems possible that an IS
element adjacent to SepC may have been involved in the acquisition of the sep genes by S.
entomophila.

Blackburn et al. (1998) undertook histological examinations of the lepidopteran Manduca
sexta after treatment with the P. luminescens Tca toxin complex introduced by feeding or
haemocoelic injection. They found blebbing of the midgut epithelium into the lumen,
resulting in lysis and formation of cavities. Similar histological studies have been
undertaken at various stages throughout the infection cycle of S. entomophila in C.
zealandica, and reveal a visible decline in the number of fat cells to almost minimal
levels, and an emptying of the larval gut. However, no blebbing of the midgut epithelium
was observed (Jackson et al. 1993).

The S. entomophila pathogenicity region endows pathogenicity on members of the
Enterobacteriaceae such as Klebsiella spp., Enterobacter agglomerans, E. coli, and Serratia
species (Glare et al. 1996). From this we can infer that the Sep proteins are the major
virulence determinants; that the sep genes are expressed constitutively, or
under the control of conserved regulatory genes or a negative regulatory gene present in the
pathogenicity region; and that export of the toxin proteins is carried out by a conserved
chromosomally encoded system, or is an intrinsic property of the Sep proteins. The Sep
proteins have no obvious amino-terminal signal sequences, a feature shared with E-group
colicins. The release of cloacin DF13 is mediated through a small lipoprotein designated
BRP, for bacteriocin-release protein. Low-level expression of BRP in conjunction with
phospholipase A leads to the release of cloacin DF13, along with bacterial periplasmic
proteins. However, if expressed in high amounts, BRP causes cell death by cell lysis (van
der Wal, 1998). The close proximity and similar orientation of ORF3 to the sep
genes indicate that ORF3 may have an important, as yet undetermined, functional role.
Protein similarity searches show that it has high similarity to the bacteriophage lysozyme
family. In terms of amino-acid size, ORF3 closely resembles the LZBP22 lysozyme of
the Salmonella P22 bacteriophage, a protein essential for the lysis of the bacterial cell wall
(Rennell and Poteete, 1985). It is possible that ORF3 may facilitate the release of the Sep
proteins by lysing the bacterial cell wall. A low level of expression of ORF3 might, as in the
case of BRP, allow the passage of the Sep proteins across the cell wall without causing cell
death. The failure of the pBM32-9 and -24 mutations to abolish the disease
process could be due to a masking of ORF3 function by natural cell lysis of the bacteria.

A region of repetitive DNA was identified between nucleotides 683 and 743, centred within
a 1.2-kb AT-rich stretch of DNA that contains no potential ORFs. The repeat motif is
flanked by an upstream 13-bp palindrome and a degenerate downstream 33-bp palindrome.
Repeats have been found to be common sites for recombination (Allgood et al. 1988), or to
facilitate the binding of proteins. A 66-bp DNA sequence of the S. typhimurium 95-kb
virulence plasmid, termed the rsk element for reduced serum killing, comprises a series
of direct 10-bp repeats with a 21-nucleotide periodicity. The rsk element is believed to
titrate out a trans-acting factor, enhancing the expression of the Salmonella serum
resistance gene (Vandenbosch et al. 1989). It is not known whether these repeats and/or
flanking palindromes have a role in the pathogenicity process. The deletion derivative
pACΔ24, which encompasses this region, was still pathogenic towards the grass grub.
However, this deletion could also unknowingly remove the complete regulatory circuit of
the pathogenicity region, leading to constitutive expression.
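
A DNA palindrome in this sense is a sequence that reads the same as its reverse complement, and a degenerate palindrome is one with a few mismatched positions. The sketch below counts such mismatches; the function name and example sequences are hypothetical, and the centre base of an odd-length sequence (such as a 13-bp element) is treated as an unpaired spacer by assumption.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def palindrome_mismatches(dna):
    """Count arm positions that fail to base-pair with their mirror position.

    The centre base of an odd-length sequence is ignored.
    """
    dna = dna.upper()
    n = len(dna)
    return sum(
        1 for i in range(n // 2)
        if PAIR.get(dna[i]) != dna[n - 1 - i]
    )


print(palindrome_mismatches("GAATTC"))  # perfect palindrome
print(palindrome_mismatches("GAATAC"))  # one degenerate position
```

A count of zero indicates a perfect palindrome; small non-zero counts relative to the sequence length correspond to the degenerate case described above.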

THE ARABINOSE EXPRESSION SYSTEM 

Methodology 

Using the polymerase chain reaction (PCR), the initiation codon ATG of each of the three sep
genes (sepA, sepB and sepC) was individually placed into the unique NdeI site (restriction
enzyme site CATGG) of the HIS-tag arabinose expression vector pAV2-10 (obtained from
Chuck Shoemaker, AgResearch). Because large proteins, i.e. those greater than 50 kDa, are
limited in their ability to bind to HIS-tag affinity columns, the carboxyl terminus of each of
the Sep proteins did not need to be in frame with the HIS-tag site. Instead, wild-type DNA
(not PCR-amplified) containing a downstream chloramphenicol resistance gene was ligated
into the appropriate restriction enzyme site (sepA SunI; sepB HindIII; sepC BstXI) of the
pAV2-10-sep derived vectors.

The chloramphenicol resistance marker provided by the vector pACYC184
enhances the stability of each of the expression constructs: the antibiotic ampicillin, to
which pAV2-10 confers resistance, is cleaved in the media to an inactive form, which can
allow plasmid-free segregants to arise. Conversely, the antibiotic chloramphenicol is not
cleaved, heightening the level of plasmid stability under conditions of arabinose induction.

To validate the fusion of the genes to the arabinose expression vector, the PCR-generated
products and the ligation junctions were verified by DNA sequencing.

Concurrently, the sepB and sepC genes were placed, as derived from pADAP,
downstream of sepA. In addition, sepA, sepB and sepC were placed, as in pADAP, downstream
of orf3. This simulated wild-type conditions (i.e. the arrangement of the sep genes on pADAP),
with the aim of obtaining production of the Sep proteins and the complex driven from the one
upstream promoter. Western analysis has shown this method to be successful, with
moderate levels of SepA, SepB and SepC being detected.

The arabinose expression system is one of the tightest systems known, with almost complete
abolition of gene product under arabinose-free conditions (Guzman et al. 1995); this
abolition can be enhanced by providing glucose to the medium. In contrast, providing
arabinose at a concentration of 0.2% will switch the arabinose promoter on and express any
genes under its control, e.g. sepA. Typically, an overnight culture of the E. coli strain
was set up; the next day, 100 µl of the culture was suspended in fresh media
supplemented with chloramphenicol (30 µg/ml). The culture was grown until an OD of 400,
at which time arabinose was added to the culture to a final concentration of 0.2%, and the
culture was left shaking at 30 °C for 18 hours.
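
The induction step adds arabinose to a final concentration of 0.2%. As a rough worked example, the sketch below computes the volume of stock solution needed; the 20% stock concentration and 10 ml culture volume are illustrative assumptions and are not taken from the protocol.

```python
def stock_volume_ml(culture_ml, stock_percent, final_percent=0.2):
    """Volume of stock to add so the mixed culture reaches final_percent."""
    if stock_percent <= final_percent:
        raise ValueError("stock must be more concentrated than the target")
    # Mass balance: stock% * v = final% * (culture_ml + v), solved for v.
    return final_percent * culture_ml / (stock_percent - final_percent)


# Assumed values: 10 ml culture, 20% arabinose stock.
volume = stock_volume_ml(culture_ml=10.0, stock_percent=20.0)
print(round(volume, 3), "ml of stock")
```

Because the added stock also increases the total volume, the result is slightly more than the simple 1:100 dilution would suggest.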

To date, Western analysis has shown that each of the proteins is expressed, and expressed at
its correct predicted size:

SepA 262.7 kDa
SepB 156.6 kDa
SepC 107 kDa

SepC is expressed at high levels with minor levels of proteolytic cleavage. However, both
SepA and SepB, though expressed, are cleaved in high amounts by endogenous E. coli
proteases. Alternative strains of E. coli are going to be assessed for loss of proteolytic
activity against SepA and SepB.

It has also been shown that placing all three of the sep genes under the control of a single
arabinose promoter will result in the production of basal levels of the SepA, SepB, SepC
toxin complex.

Each of the following Coleopteran species was mouth-injected with 3-5 µl of an overnight
suspension of induced bacteria (E. coli strain DHB101) containing either SepA, SepB and
SepC or orf3, SepA, SepB and SepC.

Each larva was then given a 3 mm³ piece of carrot coated with a 50% solution (dH2O) of
arabinose. Observations were noted each day and the larvae re-fed with a 3 mm³ piece of
carrot coated with a 50% solution (dH2O) of arabinose.

Red-headed cockchafer
Tasmanian grass grub
Odontria
Grass grub (positive control)

Under these conditions it has been found that the arabinose-expressed toxin complex SepA,
SepB and SepC is active against grass grub but not against any of the other species of
scarabs tested (see above). It is therefore thought unlikely that the toxin complex will have
activity against other insect orders.

SUMMARY

The bacteria Serratia entomophila and S. proteamaculans cause amber disease in the grass
grub, Costelytra zealandica (Coleoptera: Scarabaeidae), an important pasture pest in New
Zealand. Larval disease symptoms include amber colouration, clearance of the gut and
rapid cessation of feeding, before eventual death. The region containing the pathogenic
determinants of the disease has been cloned, and further defined by mutagenesis and
deletion analysis to a 16.9 kb region. Sequence analysis of the minimal pathogenicity-encoding
region showed significant protein homology, but little DNA sequence homology, to a
group of newly described toxins from a member of the Enterobacteriaceae, Photorhabdus
luminescens. This pathogenicity-encoding region from S. entomophila plasmid pADAP is
the subject of the invention. The proteins encoded by the genes (sepA, sepB, sepC) within
the 16.9 kb region can be used for insect control, whether as an inundative pesticide, within
baits or expressed in other organisms such as plants or microbes.

Aspects of the present invention have been described by way of example only and it should
be appreciated that modifications and additions may be made thereto without departing
from the scope thereof as defined in the appended claims.

Table 1 Bacterial strains, plasmids and bacteriophage used in the study

Strain, plasmid or phage | Description | Reference

Bacteria

Escherichia coli
DH5α | F- φ80d lacZΔM15 Δ(lacZYA-argF)U169 recA1 endA1 supE44 | Hanahan (1983)
DH10B | F- mcrA Δ(mrr-hsdRMS-mcrBC) φ80d lacZΔM15 ΔlacX74 endA1 recA1 deoR Δ(ara, leu)7697 araD139 galU galK nupG rpsL λ- | Lorow and Jessee (1990)
DF1 | γδ transposase (tnpA) | Gibco BRL
MC1061 | sup° hsdR mcrB araD139 Δ(araABC-leu)7679 ΔlacX74 galU galK rpsL thi | Casadaban and Cohen (1980)
MC4100 | araD139 Δ(lacZYA-argF)U169 rpsL150 relA1 flbB5301 deoC1 ptsF25 rbsR | Silhavy et al. (1984)
XL1-Blue MRA | Δ(mcrA)183 Δ(mcrCB-hsdSMR-mrr)173 endA1 supE44 thi-1 recA1 gyrA96 relA1 | Stratagene

Serratia entomophila
A1MO2 | ApR, pADAP, pathogenic | Grimont et al. (1988)
5.6 | heat-cured pADAP minus derivative of A1MO2 | Glare et al. (1993)
5.6RC | CmR recA- pADAP minus strain | Grkovic et al. (1996)
5.6RK | KnR recA- pADAP minus strain | this study

Plasmids
pACYC184 | CmR, TcR | Chang and Cohen (1978)
pADAP | Amber disease associated plasmid | Glare et al. (1993)
pBR322 | ApR, TcR | Bolivar et al. (1977)
pBM32 | 23-kb BamHI fragment from pMH32 cloned in pBR322 | this study
pBM32-1 to -40 | pBM32 containing mini-Tn10 insertions | this study
pDELTA1 | ApR, SmR, KnR, sucroseR | Gibco BRL
pLAFR3 | TcR, pRK290 with λcos, lacZα and multi-cloning site from pUC8 | Staskawicz et al. (1987)
pRK2013 | IncP, KnR, Tra RK2, repRK2, repE1 | Ditta et al. (1980)
pGLA20 | 10.6-kb HindIII pADAP fragment cloned in pLAFR3 | Corbett (unpublished)
pACΔ4 | 19-kb BamHI fragment from pBM32-4 cloned in pACYC184 | this study
pACΔ8 | 17-kb BamHI fragment from pBM32-8 cloned in pACYC184 | this study
pACΔ10 | 19.5-kb BamHI fragment from pBM32-10 cloned in pACYC184 | this study
pACΔ20 | 20-kb BamHI fragment from pBM32-20 cloned in pACYC184 | this study
pACΔ23 | 21-kb BamHI fragment from pBM32-23 cloned in pACYC184 | this study
pACΔ24 | 21.2-kb BamHI fragment from pBM32-24 cloned in pACYC184 | this study
pADK-10 | pADAP::mini-Tn10 insertion in 10.6-kb HindIII fragment, KnR, non-pathogenic | Grkovic et al. (1995)
pADK-13 | pADAP::mini-Tn10 insertion in 10.6-kb HindIII fragment, KnR, non-pathogenic | Grkovic et al. (1995)
pADK-35 | pADAP::mini-Tn10 insertion in 10.6-kb HindIII fragment, KnR, pathogenic | Grkovic et al. (1995)
pMH32 | 23-kb BamHI fragment of pADAP cloned into pLAFR3 | this study
pMH41 | 33-kb BamHI fragment of pADAP cloned into pLAFR3 | this study
pBM32 | 23-kb BamHI fragment of pMH32 cloned into pBR322 | this study
pUC19 | ApR, lacZα, multi-cloning site | Yanisch-Perron et al. (1985)

Bacteriophage
λNK1316 | mini-Tn10 derivative 103 donor, λb522 cI857 Pam80 nin5 | Kleckner et al. (1991)

Table 2 Position of genes and features of the predicted gene products encoded by sep genes

ORF | Putative ribosome-binding site (a) | Longest potential coding region: start at nucleotide | Stop at nt (ORF size, bp) | sep %GC (P. luminescens homologue, %GC)
sepA | ATGGGACCATCAACGTAATGAATGAGG | 2413 | 9547 (7131) | 54 (tcbA, 43; tcdA, 44)
sepB | CGAGGAGACTGAGCATGCAA | 9598 | 13885 (4287) | 58 (tcaC, 51)
sepC | ACAGGAGATCACATGAGC | 14545 | 17467 (2922) | 55 (tccC, 54)
ORF1 | CATAGAGACTGTCGCTATGTTA | 1287 | 1587 (300) | 39
ORF2 | TTGGAGAATAACCGCCATGTT | 1590 | 1863 (273) | 39
ORF3 | GGGGGAGAAAAATGAAG | 1860 | 2294 (435) | 51
ORF4 | TGACTGGGAAGGAGGGGGGGACGGTGATGAGT | 13908 | 14483 (576) | 60
ORF5 | TAACGAGACTTTTTAGCAAAATGGCACTTT | 1761-1755, 1755-1773 | | ?
ORF6 | GAGCATGGC-Mini-Tn10-8 * | 18934-18064 | | ?

(a) Putative ribosome-binding sites are underlined, and potential start codons are in boldface; nt, nucleotides; ?,
degenerate or incomplete ORF; *, ORF transcribed in opposing direction.
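
The %GC values in Table 2 are the proportion of G and C bases in each coding region. A minimal sketch of that calculation follows; the function name is hypothetical, and the example input is the short sepC ribosome-binding-site region listed in Table 2 rather than the full gene, so the result is not expected to match the 55% reported for sepC.

```python
def gc_percent(dna):
    """Return the percentage of G and C among the A, C, G and T bases."""
    bases = [b for b in dna.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Short region upstream of and including the sepC start codon (Table 2).
print(gc_percent("ACAGGAGATCACATGAGC"))
```

Applied to whole genes and to the host genome, the same calculation supports the kind of comparison used in the text to argue about how recently the sep genes were acquired.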



Table 3. Comparisons of GC content between the sep and P. luminescens genes

sep gene (%GC) | P. luminescens toxin gene (%GC)
sepA (54%) | tcbA (43%), tcdA (44%)
sepB (56%) | tcaC (51%)
sepC (55%) | tccC (54%)

Table 4. Similarities of products of putative ORFs to protein sequences in the database
detected using BlastP

ORF (a.a. size) | Protein homologue (a.a. size) | Degree of similarity: %identity/%similarity (over a.a.) a.a. residue - a.a. residue | Function of the homologous protein | Organism | Blast score | Reference
SepA (2373) | TcbA (2504) | 34/50 (1675) 41-1628*; 57/72 (751) 1630-2374* | insecticidal toxin complex protein | Photorhabdus luminescens | 0.0 | AF047457
 | TcdA (2405) | 40/55 (2458)* | insecticidal toxin complex protein | P. luminescens | 0.0 | Ensign et al. (1997)
 | TcaB (1189) | 38/54 (764) 1625-2374*; 29/50 (281) 936-1198* | insecticidal toxin complex protein | P. luminescens | [illegible] | AF046867
 | TccB (1565) | 36/51 (859) 1575-2373*; 31/51 (289) 930-1204* | insecticidal toxin complex protein | P. luminescens | | AF047028
 | TcaA (1095) | 36/56 (90) 94-183*; 18/39 (530) 435-928* | insecticidal toxin complex protein | P. luminescens | [illegible] | AF046867
 | TccA (965) | 27/45 (186) 115-280* | insecticidal toxin complex protein | P. luminescens | [illegible] | AF047028
 | Cbm71 (613) | 24/41 (199) 1057-1250* | mosquitocidal toxin Cbm71 | Clostridium bifermentans | | g2127309
SepB (1428) | TcaC (1485) | 49/63 (1276) 1-1263*; 64/78 (152) 1270-1421* | insecticidal toxin complex protein | P. luminescens | 0.0 | AF046867
 | SpvB (591) | 40/52 (357) 9-365* | Salmonella virulence protein | Salmonella typhimurium | | S22664
SepC (938) | TccC (1043) | 53/66 (836) 3-782* | insecticidal toxin complex protein | P. luminescens | 0.0 | AF047028
 | SC2H4.02 (2183) | 23/34 (639) 68-677* | hypothetical wall-associated protein | Streptomyces coelicolor | [illegible] | AL031514.1
 | WapA (2334) | 22/34 (430) 255-677*; 20/36 (613) 48-625* | wall-associated protein precursor | B. subtilis | [illegible] | S32920
 | Y15898 (334) | 21/34 (542) 181-684* | hypothetical wall-associated protein | Coxiella burnetii | 9e-5 | Y15898
 | Rhs core (1420) | 21/35 (463) 237-677*; 21/36 (285) 35-300* | Rhs core protein | E. coli | [illegible] | AF044501
ORF3 (144) | BB103G (263) | 45/62 (142) 1-139* | morphogenesis protein of bacteriophage B103 | Bacillus subtilis | | CAA67646
 | LZBP22 (146) | 46/61 (139) 1-143 | phage P22 lysozyme (EC 3.2.1.17) | Salmonella | [illegible] | gi 138699
ORF4 (191) | Gp55 (181) | 28/42 (188) 1-184* | bacteriophage N15 protein | E. coli | [illegible] | AF064539
ORF5 (236) | SprA | 75/79 (68) 1-68 ♦ | resolvase/invertase homologue | S. typhimurium | [illegible] | AF029069, AF020806
ORF6 (310) | IS91 | 39/56 (94) 130-197* T; 39/58 (94) 224-318 ♦ T; 30/48 (76) 319-395 ♦ V | IS91 transposase | E. coli | [illegible] | S23782

Percent identities and similarities were calculated in relation to the deduced gene products of the sequenced
ORF. * indicates position of amino-acid similarity in relation to the sequence generated in this study. ♦ indicates
position of amino-acid similarity in relation to the database protein sequence. T, V: reading frame. Similarities were
considered potentially significant if the BlastP score exceeded e-5.




Table 5 Positions of mini-Tn10 insertions

Mini-Tn10 insertion # | ORF | Position downstream of initiation codon (bp)
9/23 | ORF3 | [illegible]
24 | ORF3 | [illegible]
4 | sepA | [illegible]
27 | sepA | 1037
40 | sepA | 1097
6 | sepA | 1727
38 | sepA | 2887
2 | sepA | 3197
5 | sepA | 3737
3 | sepA | 3697
19 | sepA | 3697
30 | sepA | 4467
37 | sepA | 4467
31 | sepA | 4627
12 | sepB | 182
22 | sepB | 172
11 | sepB | 362
10 | sepB | 2162
35 | ORF4 | 557
13 | sepC | 2525
8 | | 18937

ORF4/-35 junction GGG CGC TGA TGA ATC

The Claims Defining the Invention are:

1 . A purified and isolated nucleic acid molecule comprising a nucleotide sequence of SEQ ID 
NO: 1 that encodes at least one of: 

(i) an insecticidal protein complex, or 

(ii) a functional fragment of said complex, or 
(ni) a neutral mutation of said complex, or 
(iv) a homolog of said complex, 

each of which have at least 75% nucleic acid homology to SEQ ID NO: 1 and are capable 
of hybridising with said nucleic acid molecule under stringent hybridisation conditions. 

2. A purified and isolated nucleic acid molecule as claimed in Claim 1 comprising the 
nucleotide sequence 1995-18937 of SEQ ID NO: 1. 

3. A purified and isolated nucleic acid molecule as claimed in Claim 1 comprising one or
more of the nucleotide sequences 2411-9547, 9598-13884 or 14546-17467 of SEQ ID NO:
1.

4. A purified and isolated nucleic acid molecule as claimed in Claim 3 comprising all of
nucleotide sequences 2411-9547, 9598-13884 and 14546-17467 of SEQ ID NO: 1.

5. A purified and isolated nucleic acid molecule as claimed in Claim 1 comprising a sequence
of SEQ ID NO: 1, operably linked to at least one further nucleotide sequence which
encodes an insecticidal protein.

6. A purified and isolated nucleic acid molecule as claimed in Claim 2 comprising
nucleotides 1955-18937 of SEQ ID NO: 1, operably linked to at least one further
nucleotide sequence which encodes an insecticidal protein.

7. A purified and isolated nucleic acid molecule as claimed in Claim 3 comprising a sequence
of SEQ ID NO: 1, or one or more of nucleotides 2411-9547, 9598-13884 or 14546-17467
of SEQ ID NO: 1, operably linked to at least one further nucleotide sequence which
encodes an insecticidal protein.

8. A purified and isolated nucleic acid molecule as claimed in any one of claims 4 through 6
wherein the said nucleotide sequence includes the nucleotide sequence which codes for at
least one of the Bacillus delta endotoxins, vegetative insecticidal proteins (vips),
cholesterol oxidases, Clostridium bifermentans mosquitocidal toxins and/or
Photorhabdus luminescens toxins.

9. A purified and isolated nucleic acid molecule as claimed in claim 1 wherein the nucleic acid
molecule may comprise DNA, cDNA or RNA.

10. A purified and isolated nucleic acid molecule as claimed in claim 1 wherein the nucleic 
acid molecules said fragment, neutral mutation or homolog thereof capable of hybridising 
to said nucleic acid molecule, hybridise to the nucleotide sequence of SEQ ID NO: 1, or 
nucleotides 1955-18937, 2411-9547, 9598-13884 or 14546-17467 of SEQ ID NO: 1 if 
there is at least 75% or greater identity between the sequences. 

11. A purified and isolated nucleic acid molecule as claimed in claim 1 wherein the nucleic 
acid molecule may be isolated from Serratia entomophila or Serratia proteamaculans 
strains of bacteria. 

12. A recombinant expression vector(s) containing the nucleic acid molecule as claimed in 
Claim 1 and host transformed with the vector expressing a polypeptide. 

13. A recombinant expression vector(s) as claimed in claim 11 wherein the vector is selectable
from any suitable natural or artificial plasmid/vector.

14. A recombinant expression vector(s) as claimed in claim 13 wherein said suitable natural or
artificial plasmid/vector includes pUC19 (Yanisch-Perron et al. 1995), pProEX HT
(GibcoBRL, Gaithersburg, MD, USA), pBR322 (Bolivar et al. 1977), pACYC184 (Chang
et al. 1978), pLAFR3 (Staskawicz et al. 1987).

15. A polypeptide resulting from the transformation or transfection of a host cell with a
recombinant expression vector as claimed in any one of Claims 12 through 14.

16. A method of producing a polypeptide of claim 15 comprising the steps of:

(a) culturing a host cell which has been transformed or transfected with said vector as
defined above to express the encoded polypeptide or peptide; and

(b) recovering the expressed polypeptide or peptide.

17. The use of a ligand that binds to a polypeptide of claim 15 to isolate and/or identify the 
polypeptide of claim 15. 

18. An antibody or antibody binding fragment that binds to a polypeptide of claim 15. 

19. Probes and primers comprising a fragment of the nucleic acid molecule as claimed in 
Claim 1 wherein said fragment is hybridisable under stringent conditions to a native 
insecticidal gene sequence. 

20. Probes and primers comprising a fragment of the nucleic acid molecule as claimed in 
claim 19 wherein said probes and primers enable the structure and function of the gene to 
be determined and homologs of the gene to be obtained from bacteria other than Serratia 
sp. 

21. A polypeptide as claimed in Claim 15 wherein the polypeptide has insecticidal activity 
encoded by the nucleic acid molecule of claim 1, or a functional fragment, neutral 
mutation or homolog thereof. 

22. A polypeptide having insecticidal activity as claimed in claim 21 wherein the polypeptide 




comprises the amino acid sequence of SEQ ID NO: 1 or a functional fragment, neutral 
mutation or homolog thereof. 

23. A polypeptide having insecticidal activity as claimed in claim 21 wherein the polypeptide 
comprises amino acids 32-5118 of SEQ ID NO: 1.

24. A polypeptide having insecticidal activity as claimed in claim 21 wherein the polypeptide 
comprises at least one amino acid sequence of SEQ ID NO: 2; SEQ ID NO: 3; SEQ ID 
NO: 4; SEQ ID NO: 5 or SEQ ID NO: 6. 

25. A polypeptide having insecticidal activity as claimed in claim 24 wherein the polypeptide 
preferably comprises amino acid sequence SEQ ID NO: 4; SEQ ID NO: 5 and SEQ ID 
NO: 6. 

26. A polypeptide having insecticidal activity as claimed in claim 24 wherein the polypeptide 
preferably comprises all of SEQ ID NOs: 2-6. 

27. A polypeptide having insecticidal activity as claimed in claim 21 wherein the polypeptide 
is obtained by expression of a DNA sequence coding therefore in a host cell or organism. 

28. A polypeptide having insecticidal activity as claimed in claim 27 wherein the polypeptide 
comprises the amino acid sequence of SEQ ID NO: 1 linked to at least one further amino 
acid sequence encoding an insecticidal protein. 

29. A polypeptide having insecticidal activity as claimed in claim 28 wherein the at least one
further amino acid sequence includes the amino acid sequence which codes for Bacillus
delta endotoxins, vegetative insecticidal proteins (vips), cholesterol oxidases, Clostridium
bifermentans mosquitocidal toxins and/or Photorhabdus luminescens toxins.

30. A polypeptide having insecticidal activity as claimed in claim 28 wherein the polypeptides 
comprise at least 50%, preferably 60%, more preferably 70% and most preferably 90-95% 
or greater identity to SEQ ID NO: 1. 
31. A polypeptide having insecticidal activity as claimed in claim 21 wherein the polypeptide
is produced by expression of a vector comprising the nucleic acid of SEQ ID NO: 1 or a
functional fragment, neutral mutation or homolog thereof, in a suitable host cell.

32. An insecticidal composition comprising at least the polypeptide as claimed in claim 21 and
an agriculturally acceptable carrier.

33. An insecticidal composition as claimed in claim 32 wherein more than one polypeptide is 
included in the composition. 

34. An insecticidal composition as claimed in claim 32 or 33 wherein the composition 
comprises additional pesticides, including compounds known to possess herbicidal, 
fungicidal, insecticidal or nematicidal activity. 

35. An insecticidal composition as claimed in claim 34 wherein the composition comprises
other known insecticidally active agents, including Bacillus delta endotoxins, vegetative
insecticidal proteins (vips), cholesterol oxidases, Clostridium bifermentans mosquitocidal
toxins and/or Photorhabdus luminescens toxins.

36. A method of combating pests, said method comprising applying to a locus, host and/or the 
pest, an effective amount of the polypeptide as claimed in Claim 21 that has functional 
insecticidal activity against said pest. 

37. A method of inducing amber disease or like condition in insects comprising delivery to an 
insect an effective amount of the polypeptide as claimed in Claim 21 that has functional 
insecticidal activity against said insect. 

38. A method of inducing amber disease or like condition in insects as claimed in claim 37 
comprising delivery to an insect an effective amount of the polypeptide wherein the insect 
is selected from the order comprising Coleoptera. 

39. A method of inducing amber disease or like condition in insects as claimed in Claim 38
comprising delivery to an insect an effective amount of the polypeptide wherein the insect
includes Costelytra zealandica (Coleoptera: Scarabaeidae).

40. A method of delivering the insecticidal polypeptide to induce amber disease or like
condition in insects including delivery of the insecticidal polypeptide as claimed in Claim
39 to the insect by any one of presenting the insecticidal polypeptide orally as a solid bait
matrix, as a sprayable insecticide sprayed onto a substrate upon which the insect feeds,
applied directly to the soil subsurface or as a drench or is expressed in a transgenic plant,
bacterium, virus or fungus upon which the insect feeds.

41. A transgenic plant, bacterium, virus or fungus, incorporating in its genome a nucleic acid
molecule as claimed in Claim 1 for providing the plant, bacterium, virus or fungus with an
ability to express an effective amount of an insecticidal polypeptide.
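
Several claims rely on a percentage threshold (for example, at least 75% identity between sequences). The claims do not say how that percentage is to be computed, so the following is only an illustrative sketch under stated assumptions: the two sequences are already aligned to equal length, gaps are written as "-", and columns containing a gap are excluded from the count. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the gap-free columns of a pairwise alignment.

    Assumptions (not taken from the claims): both strings come from an
    existing alignment, '-' marks a gap, and gapped columns are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if base_a == base_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 75.0) -> bool:
    """True when the identity is at or above the given percentage."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Other conventions, such as dividing by the full alignment length or by the shorter sequence, give different percentages for the same pair, which is why the choice is flagged here as an assumption.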
(54) Title: NUCLEOTIDE SEQUENCES ENCODING AN INSECTICIDAL PROTEIN COMPLEX FROM SERRATIA

(57) Abstract: The present invention concerns novel nucleotide sequences encoding proteins from the Enterobacteriaceae,
Serratia entomophila and Serratia proteamaculans, and the use of said nucleotide sequences and proteins for inherent insecticidal and
potentially metazoacidal properties. The invention relates to an isolated nucleic acid molecule comprising a nucleotide sequence
that encodes an insecticidal protein complex, or a functional fragment, neutral mutation, or homolog thereof capable of hybridising
with the nucleic acid molecule under standard hybridisation conditions. The nucleotide sequences include a pathogenicity-encoding
region cloned from the bacteria Serratia entomophila and S. proteamaculans. The region contains pathogenic determinants of a disease
that affects the grass grub, Costelytra zealandica (Coleoptera: Scarabaeidae), an important insect pasture pest in New Zealand. The
proteins encoded by determined genes may be used for insect control whether as an inundative pesticide, within baits or expressed
in other organisms such as plants or microbes.
Atty Docket No. 24747-1104US

DECLARATION FOR PATENT APPLICATION 

As a below-named inventor, I hereby declare that: 

My residence, post office address and citizenship are as stated below next to my name. 

I believe I am the original, first inventor of the subject matter which is claimed and for 
which a patent is sought on the invention entitled: 

NUCLEOTIDE SEQUENCES ENCODING AN INSECTICIDAL
PROTEIN COMPLEX FROM SERRATIA 

the specification of which 

( ) is attached hereto. 

( ) was filed by an authorized person on my behalf on as

Application Serial No.

(X) was filed as PCT Application Serial No. PCT/NZ00/00174 on

04 September, 2000 .
(X) amended in a Preliminary Amendment filed March 1, 2002 .

I hereby state that I have reviewed and understand the contents of the above-identified 
specification, including the claims as amended by any amendment referred to above. 

I acknowledge the duty to disclose information which is material to the examination of
this application in accordance with Title 37, Code of Federal Regulations, §1.56(a).

I hereby claim foreign priority benefits under Title 35, United States Code, §119(a)-(d)
or §365(b) of any foreign application(s) for patent or inventor's certificate listed below
and so identified, or §365(a) of any PCT international application that designated at 
least one country other than the United States of America, listed below, and I have also 
identified below any foreign application for patent or inventor's certificate or PCT 
international application on this invention filed by us or our legal representatives or 
assigns and having a filing date before that of the application on which priority is 
claimed. 

Number    Country        Day/Month/Year Filed    Priority Claimed (Yes or No)

337610    NEW ZEALAND    02 September 1999       Yes

I hereby claim benefit under Title 35, United States Code, §119(e) of any United States
provisional application(s) listed below:

Application Serial No. Filing Date 

N/A 
I hereby claim the benefit under Title 35, United States Code, §120 of any United
States application(s) listed below and, insofar as the subject matter of each of the
claims of this application is not disclosed in the prior United States application in the
manner provided by the first paragraph of Title 35, United States Code, §112, I
acknowledge the duty to disclose material information as defined in Title 37, Code of
Federal Regulations, §1.56(a) which occurred between the filing date of the prior
application and the national or PCT international filing date of this application: 

Application Serial No. Filing Date Status 



I hereby declare that all statements made herein of my own knowledge are true and
that all statements made on information and belief are believed to be true; and further
that these statements were made with the knowledge that willful false statements and
the like so made are punishable by fine or imprisonment, or both, under Section 1001
of Title 18 of the United States Code and that such willful false statements may
jeopardize the validity of the application or any patent issued thereon. 

I hereby appoint the following attorneys and agents, with full power of substitution and 
revocation, to prosecute this application and to transact all business in the United 
States Patent and Trademark Office connected therewith and request that all 
correspondence and telephone calls in respect to this application be directed to 
Stephanie Seidman, HELLER EHRMAN WHITE AND McAULIFFE LLP, 4350 La Jolla
Village Drive, 7th Floor, San Diego, California 92122-1246; 858-450-8400: 



N/A 



PCT Application No. 
N/A 



Filing Date 



Status 



Attorney 



Reg. No. 



Stephanie Seidman 
Paula K. Schoeneck 
Dale L. Rieger 
Robert T. Ramos 




43,045 



and other members of the firm. 



Address for correspondence; 



Stephanie Seidman 

HELLER EHRMAN WHITE AND McAULIFFE LLP
4350 La Jolla Village Drive
7th Floor 

San Diego, California 92122-1246 
Full name of inventor: Travis Robert Glare
Inventor's signature:
Date:

Residence: Halswell, New Zealand

Post Office Address: 3[illegible] Whincops Road, Halswell, Christchurch, New Zealand

Citizenship: Australia

Full name of inventor: Mark Robin Holmes Hurst
Inventor's signature:
Date:

Residence: Hoon Hay, New Zealand

Post Office Address: [illegible] Hendersons Road, Hoon Hay, Christchurch, New Zealand

Citizenship: New Zealand
Full name of inventor: Trevor Anthony Jackson
Inventor's signature:
Date:

Residence: Christchurch, New Zealand

Post Office Address: 407 Halswell Road, Christchurch, New Zealand

Citizenship: New Zealand
<110> Glare, Travis T
      Hurst, Mark R H
      Jackson, Trevor A

<120> Insecticidal Nucleotide Sequences

<130> 24747-1104US

<140> US 10/070,489
<141> 2002-03-01

<150> PCT/NZ00/00174
<151> 2000-09-04

<150> NZ 337610
<151> 1999-09-02

<160> 6

<170> FastSEQ for Windows Version 4.0

<210> 1
<211> 18937
<212> DNA
<213> Serratia entomophila

<220>
<221> CDS
<222> (2411)...(9547)
<223> SepA

<221> CDS
<222> (9598)...(13884)
<223> SepB

<221> CDS
<222> (14546)...(17467)
<223> SepC

<221> CDS
<222> (1860)...(2294)
<223> ORF1

<221> CDS
<222> (13908)...(14483)
<223> ORF2

<221> misc_feature
<222> (1955)...(18937)
<223> MINIMUM SEQUENCE REQUIRED FOR PATHOGENICITY AS
      DEFINED BY DELETION AND TRANSPOSON MUTAGENESIS
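
The CDS coordinates above can be sanity-checked arithmetically: an inclusive range (start)...(end) spans end - start + 1 bases, and a coding region should be a whole number of codons. The sketch below is illustrative only and is not part of the filing. The coordinates are copied from the feature lines; the helper names `cds_length`, `summarize` and `extract` are hypothetical.

```python
# Inclusive 1-based coordinates copied from the <222> lines of SEQ ID NO: 1.
FEATURES = {
    "SepA": (2411, 9547),
    "SepB": (9598, 13884),
    "SepC": (14546, 17467),
    "ORF1": (1860, 2294),
    "ORF2": (13908, 14483),
}


def cds_length(start: int, end: int) -> int:
    """Length in bases of an inclusive 1-based range."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def summarize(features: dict) -> dict:
    """Map each feature name to (length in bp, whole codons, in-frame flag)."""
    summary = {}
    for name, (start, end) in features.items():
        length = cds_length(start, end)
        summary[name] = (length, length // 3, length % 3 == 0)
    return summary


def extract(sequence: str, start: int, end: int) -> str:
    """Slice an inclusive 1-based range out of a sequence string."""
    return sequence[start - 1:end]


if __name__ == "__main__":
    for name, (length, codons, in_frame) in summarize(FEATURES).items():
        print(f"{name}: {length} bp, {codons} codons, in frame: {in_frame}")
```

The codon count includes the terminating codon, so the number of encoded residues is one less wherever the range ends on a stop codon.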

<400> 1 

ggatccgagt gaaggaatca tcggccgctt tatacgtttc agggtgaata cggttggccg 60

caacgtggca atggatgttg tttgtgtcgg tatgaatcgc cgcaacgtac tggtgttctg 120

acatacccag tgccgataaa ctgtgacgaa cactatcaaa gatgtgttcc gtcgacctga 180 

aagccaggat ttatttttac accaatggtt gggtgggctt cctttctgaa ctggtgcatc 240 

atttagccgg catcatcaaa agatgcatgg aaatacaaat atcatattta cagacaccca 300 

agttgatgac ctgctccgtg agttgaaatg ccgacggggg aaatcagcag ccttttcaac 360 

tcatggagca gggggaaatc aatcctcaat aacccgcatt ggatatcctg ccagtgtgca 420 

tttaaccttt ttagtgtgtt tccttaatat cccaatcgtt gaatcgctac atacggcaga 480 

cattagtatc tcacttatca tcaaagtaat atcacaccga gaatgctaat ttcatgatat 540 

gaaaacgttc cattaataaa ttttcagaaa cctaacacgg catttttatg ctgatcagtg 600 

aattgattgt ttctgaaaaa attaattgca cctctgccac ttatcagata aaaacacccc 660 
atgcggtaag ttttttattt tttattaatg attttattaa tgattttatt aatgatttta 720

ttaatgattt tattaatgat tttactatag atgaatgtta acatgggtga taatttactt 780 

tactcaattt aattgttggt atgaccatgt tttagatgag tggcacggat tcattattgt 840 

aaaaaaagta tctaaaacct ttagcagcaa tcctacttga ggatgacctc gacaggactt 900

gattattgcc attttttacg aaggaagatg acgggtgata aataataaaa aaaacaaaag 960 

tatagcctta ggtatcgccg attacatcca gtaacactta ttgacttttt fettacttcta 1020 

ccgttagcta taaatatgat atttaaatct gtatttttat ataaaaccag tttatgatgc 1080 

tggattggtc attaaagtcg ttatatgtga tcgttatctg tcattgattg gtgtttaatc 1140 

ttttattctt ccagtgaggt ttcaggggga atgtattggg taatcatact catgtcattt 1200

gttgctttga tgttaaatta acgtgttcat tcattatgtt ctactgttgt ttctattgtc 1260 

cggaacgacc atagagactg tcgctatgtt aataggaata tttgactggt tatatgcgcc 1320

aagggttatc gctgcactct ctggggcgat ggtattcatc attacgcaag ataacttcat 1380

tggtgtcaga cgggtgttat tgttttttgt gtctttttta ctcggtttga cattttcaga 1440 

gacaacagct tccgttatca acttctatat cccgaatgat atacatatag gaaatgacct 1500 

tggtgccttt gttaccagcg ccgtgacggt gaagcttttt gttatcatta tgagcaagat 1560 

agagagaaaa tatcttggag aataaccgcc atgttccaaa tcatacttct taatgttaat 1620

gccgtgattt gcttggctat tgccgtcaga ttattcctgt ggcgtatcaa tcataaaatg 1680 

aaaaacattg tcgtctcttt tattgctttt ctcattatta cggcgtgcgg cgctgtctcc 1740 

atcaggacga tgacggggga gtattactat gcggattggt ccgagacgat cattaacctt 1800 

tcgcttttcc tgtctgttta tatacgcaat ggcgaaatcc ttcggtgggg ggagaaaaa 1859 

atg aag ata agt tcc cga ggt atc gca tta atc aaa gag ttc gaa ggt 1907
Met Lys Ile Ser Ser Arg Gly Ile Ala Leu Ile Lys Glu Phe Glu Gly
1 5 10 15

ctg cgc tta cac gct tat cgc tgc gcc gct gac gtc tgg act gtc ggt 1955
Leu Arg Leu His Ala Tyr Arg Cys Ala Ala Asp Val Trp Thr Val Gly
20 25 30

tat ggc cac acg gca ggg gtt aca aag ggt gac atc atc acg gtc gat 2003
Tyr Gly His Thr Ala Gly Val Thr Lys Gly Asp Ile Ile Thr Val Asp
35 40 45

gaa gcc cag acg atg ctg aca aac gat att acc gta ttt gaa cgg gcg 2051
Glu Ala Gln Thr Met Leu Thr Asn Asp Ile Thr Val Phe Glu Arg Ala
50 55 60

gtc agt cag gcc gtc gcg gtt cct ctg aat cag tcg caa tac gat gcc 2099
Val Ser Gln Ala Val Ala Val Pro Leu Asn Gln Ser Gln Tyr Asp Ala
65 70 75 80

ctg gtt tct ttg gtt ttt aat att ggc cag ggg aat ttt aaa cgc tct 2147
Leu Val Ser Leu Val Phe Asn Ile Gly Gln Gly Asn Phe Lys Arg Ser
85 90 95

acc ttg ttg aaa aaa ctc aac aaa cag gac tat gtc ggc gcc ggg aac 2195
Thr Leu Leu Lys Lys Leu Asn Lys Gln Asp Tyr Val Gly Ala Gly Asn
100 105 110

gag ttt tta cgc tgg acc cgg gcc aat ggg aag gtc ctt ccc gga ctg 2243
Glu Phe Leu Arg Trp Thr Arg Ala Asn Gly Lys Val Leu Pro Gly Leu
115 120 125

att cgc cga cgc gaa gct gaa cgg gtg ttg ttt gag aaa ctg ggt gca 2291
Ile Arg Arg Arg Glu Ala Glu Arg Val Leu Phe Glu Lys Leu Gly Ala
130 135 140

taa ccctttgcga cgtacccaca agatgaagat aacaccgcgt actgagcggt 2344



ggcgcaacaa tgaataaatg actgtgtacg gcctgtcctt cacaacggat gggaccatca 2404

acgtaa tga atg agg caa gac att atg tat aat att gat gat att ctg 2452
* Met Arg Gln Asp Ile Met Tyr Asn Ile Asp Asp Ile Leu
145 150 155

gag aaa gtg aat gct cca cga gca cgc ctg tca gaa gaa aac gat aca 2500
Glu Lys Val Asn Ala Pro Arg Ala Arg Leu Ser Glu Glu Asn Asp Thr
160 165 170

gcg gtg acg ctg acg gat tta ttc tcg cgt tcg ttt ccc gag gtc aaa 2548
Ala Val Thr Leu Thr Asp Leu Phe Ser Arg Ser Phe Pro Glu Val Lys
175 180 185

aaa atc act ggc gac agc ctg tca tgg gga gag gtc tgc tat ctg tac 2596
Lys Ile Thr Gly Asp Ser Leu Ser Trp Gly Glu Val Cys Tyr Leu Tyr
190 195 200 205

agt cag gcg cag cac gaa cag aaa gaa aac cgg ctc acc gaa tcc cgt 2644
Ser Gln Ala Gln His Glu Gln Lys Glu Asn Arg Leu Thr Glu Ser Arg
210 215 220

att ctg gcc cgg gcg aat ccc cta ctg gtg aat gcc gtt cgc ctg gga 2692
Ile Leu Ala Arg Ala Asn Pro Leu Leu Val Asn Ala Val Arg Leu Gly
225 230 235

ata cgg cag gca gcc ggc agt cgc agc tat gat gac tgg ttt ggc tcc 2740
Ile Arg Gln Ala Ala Gly Ser Arg Ser Tyr Asp Asp Trp Phe Gly Ser
240 245 250

cgc gca gac cgt ttc gcc cgc ccc ggc tcg gtg gcc tcc atg ttc tca 2788
Arg Ala Asp Arg Phe Ala Arg Pro Gly Ser Val Ala Ser Met Phe Ser
255 260 265

ccg gcg gcg tat ctg acc gag ctg tac cgt gag gcg aag gac ctg cat 2836
Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asp Leu His
270 275 280 285

ccg gac acc tcg ctg ttc cgg ctg gac atc cgg cgt ccc gac ctg gcg 2884
Pro Asp Thr Ser Leu Phe Arg Leu Asp Ile Arg Arg Pro Asp Leu Ala
290 295 300

gcg ctg gcc ctt agc cag aat aat atg gac gac gag ctc tcc acc ctg 2932
Ala Leu Ala Leu Ser Gln Asn Asn Met Asp Asp Glu Leu Ser Thr Leu
305 310 315

agc ctg tcc aat gag cta ctg tat cgc ggt atc ggg gca gcg gaa ggg 2980
Ser Leu Ser Asn Glu Leu Leu Tyr Arg Gly Ile Gly Ala Ala Glu Gly
320 325 330

ctt gac gac gac agc gtc agg gag ctg ctc gcc ggg tat cgc ctg acc 3028
Leu Asp Asp Asp Ser Val Arg Glu Leu Leu Ala Gly Tyr Arg Leu Thr
335 340 345

ggc ctg acc ccc tat cac tgg gcg tac gag gcg gcc cgc caa gcc att 3076
Gly Leu Thr Pro Tyr His Trp Ala Tyr Glu Ala Ala Arg Gln Ala Ile
350 355 360 365

ctg gtg cag gac ccg acg ctg atg ggg ttc agc cgt aat ccg gat gtg 3124
Leu Val Gln Asp Pro Thr Leu Met Gly Phe Ser Arg Asn Pro Asp Val
370 375 380

gcg cag ctt atg gac cct gcc tcc atg ctg gcc att gaa gcc gat att 3172
Ala Gln Leu Met Asp Pro Ala Ser Met Leu Ala Ile Glu Ala Asp Ile
385 390 395

tca ccg gag ctg tat cag ata ctg gcc gaa gaa att acg aca gac agt 3220
Ser Pro Glu Leu Tyr Gln Ile Leu Ala Glu Glu Ile Thr Thr Asp Ser
400 405 410

tac gaa gca ctc tgg agt aag aat ttt ggt gat atg cct ccc tcc tca 3268
Tyr Glu Ala Leu Trp Ser Lys Asn Phe Gly Asp Met Pro Pro Ser Ser
415 420 425

ctg tta tct tat gat gca ctt gca aca ttt tat gat ctt gat tac gat 3316
Leu Leu Ser Tyr Asp Ala Leu Ala Thr Phe Tyr Asp Leu Asp Tyr Asp
430 435 440 445

gag cta act tcg tta ttg tca tta agg ctg gac ttt tca aat cca aac 3364
Glu Leu Thr Ser Leu Leu Ser Leu Arg Leu Asp Phe Ser Asn Pro Asn
450 455 460

aat gaa tac tac att aat agt caa tta agt gtc gta act ctg aat gaa 3412
Asn Glu Tyr Tyr Ile Asn Ser Gln Leu Ser Val Val Thr Leu Asn Glu
465 470 475

agc act ggt tta ata act ata cat cat tat tta aga acg cta ggc gga 3460
Ser Thr Gly Leu Ile Thr Ile His His Tyr Leu Arg Thr Leu Gly Gly
480 485 490

gac tca cag cag att aac cct gag ctt ata cct tat ggg gat gga aca 3508
Asp Ser Gln Gln Ile Asn Pro Glu Leu Ile Pro Tyr Gly Asp Gly Thr
495 500 505

tat ctt tat aat ttc agc gtg gtg tca acg ata tca gag gat agt ttc 3556
Tyr Leu Tyr Asn Phe Ser Val Val Ser Thr Ile Ser Glu Asp Ser Phe
510 515 520 525

aaa cta ggg tcg tta ggt tct aac agt agc aat ctt tac tct ggg gat 3604
Lys Leu Gly Ser Leu Gly Ser Asn Ser Ser Asn Leu Tyr Ser Gly Asp
530 535 540

tat cag ctt caa aaa ggg gtt cgc tat agc att cct gtt gaa ata gat 3652
Tyr Gln Leu Gln Lys Gly Val Arg Tyr Ser Ile Pro Val Glu Ile Asp
545 550 555

gaa gga aag tta aat gat ggg atc aca ata gga ttg agt agg aaa ggg 3700
Glu Gly Lys Leu Asn Asp Gly Ile Thr Ile Gly Leu Ser Arg Lys Gly
560 565 570



ggg gg? tat tac tca aca gta aac ttc act ctg att gaa tat gat cct 3748
Gly Gly Tyr Tyr Ser Thr Val Asn Phe Thr Leu Ile Glu Tyr Asp Pro
575 580 585

gcg ata ttc att ctt aaa tta aat aaa gtt atc cgc cta tac aag gcc 3796
Ala Ile Phe Ile Leu Lys Leu Asn Lys Val Ile Arg Leu Tyr Lys Ala
590 595 600 605

acg ggc atg acc acg gcg gaa ata tat caa atc acc aat att ctt aat 3844
Thr Gly Met Thr Thr Ala Glu Ile Tyr Gln Ile Thr Asn Ile Leu Asn
610 615 620

aac ggt ctc acc att gac cat gcg gtc ctg agt aaa atc ttc ctg gtc 3892
Asn Gly Leu Thr Ile Asp His Ala Val Leu Ser Lys Ile Phe Leu Val
625 630 635

cgt tac ctg atg cgt cac tat cag ctt gat gtg gcc cgg tca ctg ata 3940
Arg Tyr Leu Met Arg His Tyr Gln Leu Asp Val Ala Arg Ser Leu Ile
640 645 650

ttg tgc aac gga acc atc agt gac cag gcg ttc agc ggc gaa acc ggc 3988
Leu Cys Asn Gly Thr Ile Ser Asp Gln Ala Phe Ser Gly Glu Thr Gly
655 660 665

ctg ttc acc acg ctg ttc aac acc cca ccg ctg aac ggc cag ctg ttt 4036
Leu Phe Thr Thr Leu Phe Asn Thr Pro Pro Leu Asn Gly Gln Leu Phe
670 675 680 685

tct gca gat gat acc ccc ctc gac tta cgc tct gaa gca ccg gag gat 4084
Ser Ala Asp Asp Thr Pro Leu Asp Leu Arg Ser Glu Ala Pro Glu Asp
690 695 700



gct ttc cgt ctc agc gta ctg aaa cgc gca ttt aac atc agc gcc tcg 4132
Ala Phe Arg Leu Ser Val Leu Lys Arg Ala Phe Asn Ile Ser Ala Ser
705 710 715

ggg ctt tcc acg ctc tgg cag ttg gcc agc ggt gac agc agc gct ggg 4180
Gly Leu Ser Thr Leu Trp Gln Leu Ala Ser Gly Asp Ser Ser Ala Gly
720 725 730

ttt agc tgc tct gct gac aat atc gcc gca ctc tac cga gtg aaa ctc 4228
Phe Ser Cys Ser Ala Asp Asn Ile Ala Ala Leu Tyr Arg Val Lys Leu
735 740 745

ctg gct gac atc cac gac cta tcc gct ggt gag ctg tca atg ttg ctg 4276
Leu Ala Asp Ile His Asp Leu Ser Ala Gly Glu Leu Ser Met Leu Leu
750 755 760 765

tcc gtc tcc cct ttc agc ggg gtg gcc gcc ggc tcg ctg tcc gat aat 4324
Ser Val Ser Pro Phe Ser Gly Val Ala Ala Gly Ser Leu Ser Asp Asn
770 775 780

gag ctg acg cag ttt ctg tac cag acc acc acc tgg ctc acg gag cag 4372
Glu Leu Thr Gln Phe Leu Tyr Gln Thr Thr Thr Trp Leu Thr Glu Gln
785 790 795

ggc tgg acg gtc agc gat gtg ttc ctg atg ctg acg acg cag tac ggt 4420
Gly Trp Thr Val Ser Asp Val Phe Leu Met Leu Thr Thr Gln Tyr Gly
800 805 810

acc ctg ctg acc ccc gac att gag aac ctg ctc gct tcc ctg cgc aac 4468
Thr Leu Leu Thr Pro Asp Ile Glu Asn Leu Leu Ala Ser Leu Arg Asn
815 820 825

gga ctg tcg ggc cgt gag ctg ttc ccg gaa acg ctc ccc ggc gat ggc 4516
Gly Leu Ser Gly Arg Glu Leu Phe Pro Glu Thr Leu Pro Gly Asp Gly
830 835 840 845

gct ccc ttt att gcc gcc gcc atg cag ctg gac gcc acg gat acg gcg 4564
Ala Pro Phe Ile Ala Ala Ala Met Gln Leu Asp Ala Thr Asp Thr Ala
850 855 860

aag gcg atg ctg act tgg gcg gac cag ttg aag cca gag ggg ctg acg 4612
Lys Ala Met Leu Thr Trp Ala Asp Gln Leu Lys Pro Glu Gly Leu Thr
865 870 875

ctg acg gaa ttt att ctt ttg gtg atg aat gcc gcc cca aat gac gag 4660
Leu Thr Glu Phe Ile Leu Leu Val Met Asn Ala Ala Pro Asn Asp Glu
880 885 890

cag gcg ggc cag atg gca ggg ttc tgc caa gcc ctg tgg caa ctg gca 4708
Gln Ala Gly Gln Met Ala Gly Phe Cys Gln Ala Leu Trp Gln Leu Ala
895 900 905

ctg atc atc cgc agc acc ggc ctc agc acg cgc gag ctg acg ctg ctg 4756
Leu Ile Ile Arg Ser Thr Gly Leu Ser Thr Arg Glu Leu Thr Leu Leu
910 915 920 925

gtc agc cag ccg gga cgc ttc cgc aca gga tgg cac cat ctg ccc cat 4804
Val Ser Gln Pro Gly Arg Phe Arg Thr Gly Trp His His Leu Pro His
930 935 940

gac ctg ccg gcg ctt cgc gac att acg cgt ttt cat gcc gtc gtt aac 4852
Asp Leu Pro Ala Leu Arg Asp Ile Thr Arg Phe His Ala Val Val Asn
945 950 955

cgc agc ggc agc cat gcc ggg gag gtc ctg acc gca ctt gag acc gga 4900
Arg Ser Gly Ser His Ala Gly Glu Val Leu Thr Ala Leu Glu Thr Gly
960 965 970
gaa ctg tcg tca gcc ctg ctg gcc cgg gcc ctg tca cag aat gag cag 4948
Glu Leu Ser Ser Ala Leu Leu Ala Arg Ala Leu Ser Gln Asn Glu Gln
975 980 985

gat gtg acc ggc gcc ttg gcg cag gtg agg ggg gcc ggt gaa cag gac 4996
Asp Val Thr Gly Ala Leu Ala Gln Val Arg Gly Ala Gly Glu Gln Asp
990 995 1000 1005

aac agc gtg ttc acc tcc tgg gaa gag gtg gac cag gct gag cag tgg 5044
Asn Ser Val Phe Thr Ser Trp Glu Glu Val Asp Gln Ala Glu Gln Trp
1010 1015 1020

ctg gac atg agt gag acc ctg tcc att acg cca tcc ggt ctg gct agc 5092
Leu Asp Met Ser Glu Thr Leu Ser Ile Thr Pro Ser Gly Leu Ala Ser
1025 1030 1035

ctg att gcc ctg aag tac atc aat gtg tcc gat gac agt gca ccg ttg 5140
Leu Ile Ala Leu Lys Tyr Ile Asn Val Ser Asp Asp Ser Ala Pro Leu
1040 1045 1050

tac agc cag tgg cag gtg gta tcc ggt ctg ctg cag gcc ggg ctg aaa 5188
Tyr Ser Gln Trp Gln Val Val Ser Gly Leu Leu Gln Ala Gly Leu Lys
1055 1060 1065

agc agc cag agc tcg gcg ctg cac gat tat ctg gag gag ggg acc agc 5236
Ser Ser Gln Ser Ser Ala Leu His Asp Tyr Leu Glu Glu Gly Thr Ser
1070 1075 1080 1085

agc gcc ctt tgt gcg tat tat ctg cgt aat ctg gca ccg aac atg gta 5284
Ser Ala Leu Cys Ala Tyr Tyr Leu Arg Asn Leu Ala Pro Asn Met Val
1090 1095 1100

tcc ggg cgc gat gac ctc ttc ggg tat ctg ctg ctg gat aat cag gtg 5332
Ser Gly Arg Asp Asp Leu Phe Gly Tyr Leu Leu Leu Asp Asn Gln Val
1105 1110 1115

tca gcc aag gta aaa acc acc cgc att gcg gag gcc atc gcc ggc ata 5380
Ser Ala Lys Val Lys Thr Thr Arg Ile Ala Glu Ala Ile Ala Gly Ile
1120 1125 1130

cgg ctg tat atc aac cgg gcc ctt aac gga ata gaa ctc agc gcc atg 5428
Arg Leu Tyr Ile Asn Arg Ala Leu Asn Gly Ile Glu Leu Ser Ala Met
1135 1140 1145

gca gag gtg agg ggg cgt cag ttt ttc act gac tgg gat acg ttc aac 5476
Ala Glu Val Arg Gly Arg Gln Phe Phe Thr Asp Trp Asp Thr Phe Asn
1150 1155 1160 1165

aaa cgt tac agc acc tgg gcg ggc gtc tca gag ctg gtt tac tat ccg 5524
Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Glu Leu Val Tyr Tyr Pro
1170 1175 1180

gaa aac tac ctc gac ccg acg gtc cgt atc ggg cag acc ggc atg atg 5572
Glu Asn Tyr Leu Asp Pro Thr Val Arg Ile Gly Gln Thr Gly Met Met
1185 1190 1195

gac acc ctg ctg cag tct gtc agc cag agc agt atc aac cgc gat acc 5620
Asp Thr Leu Leu Gln Ser Val Ser Gln Ser Ser Ile Asn Arg Asp Thr
1200 1205 1210

gtg gag gat gcc ttt aaa acc tat ctg acc acg ttt gag cag att gcc 5668
Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Thr Phe Glu Gln Ile Ala
1215 1220 1225

aat ctg aac act gtc agc gga tat cac gat aac gcc agc atg acg cag 5716
Asn Leu Asn Thr Val Ser Gly Tyr His Asp Asn Ala Ser Met Thr Gln
1230 1235 1240 1245

ggg act aca tgg tat gtg ggt cgc agc atc aca gat cag act aac tgg 5764
Gly Thr Thr Trp Tyr Val Gly Arg Ser Ile Thr Asp Gln Thr Asn Trp
1250 1255 1260

tac tgg cgc agc gcc aac cac agc aaa atc caa gac tca atg atg ccc 5812
Tyr Trp Arg Ser Ala Asn His Ser Lys Ile Gln Asp Ser Met Met Pro
1265 1270 1275

gcg aat gcc tgg acc gga tgg aca aaa att aac tgc gga atg aat ccg 5860
Ala Asn Ala Trp Thr Gly Trp Thr Lys Ile Asn Cys Gly Met Asn Pro
1280 1285 1290

tgg tca gat ctt gtg tgc tcg gtg ttt ttc aac agt cgc ctt tat gtc 5908
Trp Ser Asp Leu Val Cys Ser Val Phe Phe Asn Ser Arg Leu Tyr Val
1295 1300 1305

gtc tgg gtc gaa gag aat cag tct gct gat acg gag gca gag agc acg 5956
Val Trp Val Glu Glu Asn Gln Ser Ala Asp Thr Glu Ala Glu Ser Thr
1310 1315 1320 1325

aca acc acg cag cag agc tac acg ctg aaa ctg tcg ttc cgg cgc tac 6004
Thr Thr Thr Gln Gln Ser Tyr Thr Leu Lys Leu Ser Phe Arg Arg Tyr
1330 1335 1340

gac ggt aca tgg agt tcc ccg gtg tcg ttc gac att acc ggc aac atc 6052
Asp Gly Thr Trp Ser Ser Pro Val Ser Phe Asp Ile Thr Gly Asn Ile
1345 1350 1355

gca ttt ccg gaa acg cag ggc atg cat gtg acc tgt aat ccc ctg act 6100
Ala Phe Pro Glu Thr Gln Gly Met His Val Thr Cys Asn Pro Leu Thr
1360 1365 1370

gag cag ctc tat tgc gcg ttt tac tcc gtc acc agc aag ccg gac ttt 6148
Glu Gln Leu Tyr Cys Ala Phe Tyr Ser Val Thr Ser Lys Pro Asp Phe
1375 1380 1385

gat aac gct cag ctg att tct gtg gat aat gat atg acg cta aat gtc 6196
Asp Asn Ala Gln Leu Ile Ser Val Asp Asn Asp Met Thr Leu Asn Val
1390 1395 1400 1405

atc tca gat ata ggg att ttt aag agc gtc agt cac gaa ttt aat acg 6244
Ile Ser Asp Ile Gly Ile Phe Lys Ser Val Ser His Glu Phe Asn Thr
1410 1415 1420

agc act gag aaa ttt att aat aat gtt ttt tca gac cct tcc gct aat 6292
Ser Thr Glu Lys Phe Ile Asn Asn Val Phe Ser Asp Pro Ser Ala Asn
1425 1430 1435

tat ttt gtc agt gca acg agt tta att gat gat gtt atc cac agc gat 6340
Tyr Phe Val Ser Ala Thr Ser Leu Ile Asp Asp Val Ile His Ser Asp
1440 1445 1450

ttc tca ctc ctt aat tct aaa act aca agt act gtt ttt act aat gaa 6388
Phe Ser Leu Leu Asn Ser Lys Thr Thr Ser Thr Val Phe Thr Asn Glu
1455 1460 1465

gat tcc tct ctt ttg acg cca gag ctt cat att aca gca aat gtt tcg 6436
Asp Ser Ser Leu Leu Thr Pro Glu Leu His Ile Thr Ala Asn Val Ser
1470 1475 1480 1485

tgt ttt gtt agt act gct ggc atc gcc act caa tct acc ata gaa aaa 6484
Cys Phe Val Ser Thr Ala Gly Ile Ala Thr Gln Ser Thr Ile Glu Lys
1490 1495 1500

ttc gtt cag gca ggg ata gaa ttt gag gaa att aat ttt tat gca ggc 6532 
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Phe Val Gin Ala Gly lie Glu Phe Glu Glu lie Asn Phe Tyr Ala Gly 
1505 1510 1515 

cag gcc gcc ggc gga ttt gac gga ttt gtg gga gtg gat gtt tct aat 6580
Gln Ala Ala Gly Gly Phe Asp Gly Phe Val Gly Val Asp Val Ser Asn
1520 1525 1530

tca aaa gta tac cag gtc gga aaa gaa gca gtt ggt gtc act gta aaa 6628
Ser Lys Val Tyr Gln Val Gly Lys Glu Ala Val Gly Val Thr Val Lys
1535 1540 1545

tct tat tcc gtc act ggc gtt agt ggt tct gtt gag tta ttt att gat 6676
Ser Tyr Ser Val Thr Gly Val Ser Gly Ser Val Glu Leu Phe Ile Asp
1550 1555 1560 1565

tca tca aat aaa tac ttc agc gga att ttg tca gat aaa atg ata acc 6724
Ser Ser Asn Lys Tyr Phe Ser Gly Ile Leu Ser Asp Lys Met Ile Thr
1570 1575 1580

gct tta att agc ggc agt aca tca aaa gtt aat tac gtg tcg tct att 6772
Ala Leu Ile Ser Gly Ser Thr Ser Lys Val Asn Tyr Val Ser Ser Ile
1585 1590 1595

ggc tct caa gat ttt tgg agt gta aag tcg ctc atg ccg gca ctt cag 6820
Gly Ser Gln Asp Phe Trp Ser Val Lys Ser Leu Met Pro Ala Leu Gln
1600 1605 1610

ata tat gaa tta atc gat gat atc ata ctg aca tcc ggc gta aat ggg 6868
Ile Tyr Glu Leu Ile Asp Asp Ile Ile Leu Thr Ser Gly Val Asn Gly
1615 1620 1625

act gaa att aaa tcc tgg cct tcc gct gaa tgg tat aat gat aag ctg 6916
Thr Glu Ile Lys Ser Trp Pro Ser Ala Glu Trp Tyr Asn Asp Lys Leu
1630 1635 1640 1645

agt ctg caa tcc ggg aat aat ctt ttc aac acc aaa tcg ctg agt ttt 6964
Ser Leu Gln Ser Gly Asn Asn Leu Phe Asn Thr Lys Ser Leu Ser Phe
1650 1655 1660

acc gtt aat acc agt gat att gtt gaa gat gag ttt gac gtg acg ttt 7012
Thr Val Asn Thr Ser Asp Ile Val Glu Asp Glu Phe Asp Val Thr Phe
1665 1670 1675

acg ttc acc gct gtc gat cag aat aac gtc gtg ctg gcc gcc cgg acg 7060
Thr Phe Thr Ala Val Asp Gln Asn Asn Val Val Leu Ala Ala Arg Thr
1680 1685 1690

gcc ata tta acc gtc att cga aac att aat aat gac act tcc gtt atc 7108
Ala Ile Leu Thr Val Ile Arg Asn Ile Asn Asn Asp Thr Ser Val Ile
1695 1700 1705

gca tta cgt aaa aat acg cgt ggc gcg cag tat att cgt ttc act gcg 7156
Ala Leu Arg Lys Asn Thr Arg Gly Ala Gln Tyr Ile Arg Phe Thr Ala
1710 1715 1720 1725

ggt aac gat gtg gcg ctt att cgc ctc aac acc ctc ttt gcc cgc caa 7204
Gly Asn Asp Val Ala Leu Ile Arg Leu Asn Thr Leu Phe Ala Arg Gln
1730 1735 1740

ctg gtc gac cgg gcg aat acc ggg att gac acc att ctt tcc atg gag 7252
Leu Val Asp Arg Ala Asn Thr Gly Ile Asp Thr Ile Leu Ser Met Glu
1745 1750 1755

acc cag agg ctt acc gaa ccc gcc ctg gaa gag ggg agt gat gtg ttt 7300
Thr Gln Arg Leu Thr Glu Pro Ala Leu Glu Glu Gly Ser Asp Val Phe
1760 1765 1770

atg gac ttc tcc gga gcc aat gcc ctc tat ttc tgg gag ctg ttc tat 7348
Met Asp Phe Ser Gly Ala Asn Ala Leu Tyr Phe Trp Glu Leu Phe Tyr
1775 1780 1785

tac acg ccg atg atg gtg ttc cag cgg ttg ttg cag gaa cag cac ttc 7396
Tyr Thr Pro Met Met Val Phe Gln Arg Leu Leu Gln Glu Gln His Phe
1790 1795 1800 1805

ccg gaa gcc acc cgc tgg ctg cag tat gtc tgg aac ccg gcc ggg cac 7444
Pro Glu Ala Thr Arg Trp Leu Gln Tyr Val Trp Asn Pro Ala Gly His
1810 1815 1820

gtg gta aac ggg gtg ctg cag aat tac acc tgg aat gtc cgt ccg ctg 7492
Val Val Asn Gly Val Leu Gln Asn Tyr Thr Trp Asn Val Arg Pro Leu
1825 1830 1835

gag gag gac acc ggc tgg aac gac tcg ccg ctg gac tcc att gac ccc 7540
Glu Glu Asp Thr Gly Trp Asn Asp Ser Pro Leu Asp Ser Ile Asp Pro
1840 1845 1850

gat gca ata gcc cag tac gac ccc atg cat tac aag gtc gcc acc ttt 7588
Asp Ala Ile Ala Gln Tyr Asp Pro Met His Tyr Lys Val Ala Thr Phe
1855 1860 1865

atg tcg tac ctc gac ctg ctg att gcc cgc ggt gat gcc gcc tac cgg 7636
Met Ser Tyr Leu Asp Leu Leu Ile Ala Arg Gly Asp Ala Ala Tyr Arg
1870 1875 1880 1885

ctg ctc gag cgg gac acc ctt aac gag gcc cgg atg tgg tac gtc cag 7684
Leu Leu Glu Arg Asp Thr Leu Asn Glu Ala Arg Met Trp Tyr Val Gln
1890 1895 1900

gcc ctg aac ctt ctg ggc gac gag ccc tat att tcc ttt gac gcc gac 7732
Ala Leu Asn Leu Leu Gly Asp Glu Pro Tyr Ile Ser Phe Asp Ala Asp
1905 1910 1915

tgg tcg gcg ttg acc ctg ggt gac gca gcc agc gag gtg acg cga cgc 7780
Trp Ser Ala Leu Thr Leu Gly Asp Ala Ala Ser Glu Val Thr Arg Arg
1920 1925 1930

gat tac cag gag gcc ctg ctg gcc gtg cgc cgg ttg gtg ccc gct ccc 7828
Asp Tyr Gln Glu Ala Leu Leu Ala Val Arg Arg Leu Val Pro Ala Pro
1935 1940 1945

gag aca cgg acg gcg aat tcc ctg acg gca ctg ttc ctc ccg cag cag 7876
Glu Thr Arg Thr Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gln Gln
1950 1955 1960 1965

aac gag gtg ctc aaa ggc tac tgg caa acc ttg gca cag cgg ctc cat 7924
Asn Glu Val Leu Lys Gly Tyr Trp Gln Thr Leu Ala Gln Arg Leu His
1970 1975 1980

aac ctg cgc cac aac ctc tcc att gac ggc cag ccg ctt tcc ctg tcc 7972
Asn Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Ser Leu Ser
1985 1990 1995

gtc tac gcc acg ccg tcc gaa ccg tcc gcc ctg cag agt gcc gtc gtc 8020
Val Tyr Ala Thr Pro Ser Glu Pro Ser Ala Leu Gln Ser Ala Val Val
2000 2005 2010

aac agc gcg cag ggt gct gca gca ctg ccg gcc gcg gtg atg ccg ctt 8068
Asn Ser Ala Gln Gly Ala Ala Ala Leu Pro Ala Ala Val Met Pro Leu
2015 2020 2025

tac agt ttc ccg gtc atg ctg gag aac gcc cgg ggg atg gtg agc ctg 8116
Tyr Ser Phe Pro Val Met Leu Glu Asn Ala Arg Gly Met Val Ser Leu
2030 2035 2040 2045

ctg acc ggg ttc ggc aac aca ctg ctc ggt att acc gag cgt cag gat 8164
Leu Thr Gly Phe Gly Asn Thr Leu Leu Gly Ile Thr Glu Arg Gln Asp
2050 2055 2060

gcg gag gcg ctg gcc aaa ctg ctg cag acc cag ggc agt gaa ctg ata 8212
Ala Glu Ala Leu Ala Lys Leu Leu Gln Thr Gln Gly Ser Glu Leu Ile
2065 2070 2075

cgc cag ggc ctt cgc cag cag gat aac gtc ctc gag gaa atc gat gcg 8260
Arg Gln Gly Leu Arg Gln Gln Asp Asn Val Leu Glu Glu Ile Asp Ala
2080 2085 2090

gat att gcc gcc ctg gag gag agc cgc cgc ggc gcg cag atg cgt ttt 8308
Asp Ile Ala Ala Leu Glu Glu Ser Arg Arg Gly Ala Gln Met Arg Phe
2095 2100 2105

gaa cgt tac aaa gtg ttg tac gag gcg gac gtc aac acc ggc gaa aaa 8356
Glu Arg Tyr Lys Val Leu Tyr Glu Ala Asp Val Asn Thr Gly Glu Lys
2110 2115 2120 2125

cag gcc atg gac ttg tac ctc agt tcg tcc gtg ctg tcg gca tca acc 8404
Gln Ala Met Asp Leu Tyr Leu Ser Ser Ser Val Leu Ser Ala Ser Thr
2130 2135 2140

gcc gcg ctc ttt ttg gcc gag gcc gcg gcc gat atg ctg ccc aat att 8452
Ala Ala Leu Phe Leu Ala Glu Ala Ala Ala Asp Met Leu Pro Asn Ile
2145 2150 2155

tac ggg ctg gcc gtc ggg ggc tcc cgc tat ggg gca cta ttt aaa gcc 8500
Tyr Gly Leu Ala Val Gly Gly Ser Arg Tyr Gly Ala Leu Phe Lys Ala
2160 2165 2170

acc gcc atc ggc atc cag gtg tcc tcc gat gcc acc cgc ata tca gcg 8548
Thr Ala Ile Gly Ile Gln Val Ser Ser Asp Ala Thr Arg Ile Ser Ala
2175 2180 2185

gac aaa atc agc cag tcg gaa gtg tac cgc cgt cgc cgg gag gag tgg 8596
Asp Lys Ile Ser Gln Ser Glu Val Tyr Arg Arg Arg Arg Glu Glu Trp
2190 2195 2200 2205

gaa atc cag cgt gat agt gcg cag tct gac gtg gcg cag att gat gcc 8644
Glu Ile Gln Arg Asp Ser Ala Gln Ser Asp Val Ala Gln Ile Asp Ala
2210 2215 2220

cag ctg gcg gcc atg gca gtg cgc cgg gaa ggg gct gag ctg cag aaa 8692
Gln Leu Ala Ala Met Ala Val Arg Arg Glu Gly Ala Glu Leu Gln Lys
2225 2230 2235

act tac ctt gag acc cag cag acc cag gca cag gcg cag ttg gca ttc 8740
Thr Tyr Leu Glu Thr Gln Gln Thr Gln Ala Gln Ala Gln Leu Ala Phe
2240 2245 2250

ctg cag agt aag ttc aac aat acg gct ctg tac agc tgg ctg cgg ggc 8788
Leu Gln Ser Lys Phe Asn Asn Thr Ala Leu Tyr Ser Trp Leu Arg Gly
2255 2260 2265

agg ttg tcc gcc att tat tac cag ttc tat gac ctg gca gta tcc cgc 8836
Arg Leu Ser Ala Ile Tyr Tyr Gln Phe Tyr Asp Leu Ala Val Ser Arg
2270 2275 2280 2285

tgc ctg atg gcg caa cag gcc tgg cag tgg gat aaa ttc gag act agg 8884
Cys Leu Met Ala Gln Gln Ala Trp Gln Trp Asp Lys Phe Glu Thr Arg
2290 2295 2300

tcg ttt atc cag ccg ggg gcc tgg atg ggg gca aat gcc ggt ctg ctg 8932
Ser Phe Ile Gln Pro Gly Ala Trp Met Gly Ala Asn Ala Gly Leu Leu
2305 2310 2315

gcc ggg gaa acc ctg atg ctg aat ctg gcg cag atg gag cag gcc tgg 8980
Ala Gly Glu Thr Leu Met Leu Asn Leu Ala Gln Met Glu Gln Ala Trp
2320 2325 2330

ctg acg ggg gat gag cgg gca ata gag gtg acg cgg acg gtc tgc ctg 9028
Leu Thr Gly Asp Glu Arg Ala Ile Glu Val Thr Arg Thr Val Cys Leu
2335 2340 2345

tcg gag gtc tat acc agc ctc gcg gag gat gcg gca ttc tct ctg gcc 9076
Ser Glu Val Tyr Thr Ser Leu Ala Glu Asp Ala Ala Phe Ser Leu Ala
2350 2355 2360 2365

gac aag gtg gtg gaa ctg gtc agt aac ggt tcg ggc agt gcg ggt acg 9124
Asp Lys Val Val Glu Leu Val Ser Asn Gly Ser Gly Ser Ala Gly Thr
2370 2375 2380

aaa agc aac gga tta cag atg gat caa cag caa ctc gag gcc acc ctg 9172
Lys Ser Asn Gly Leu Gln Met Asp Gln Gln Gln Leu Glu Ala Thr Leu
2385 2390 2395

aaa ctg gct gac ctc ggt atc ggc aac gat tac ccg gtc tcc ctt ggc 9220
Lys Leu Ala Asp Leu Gly Ile Gly Asn Asp Tyr Pro Val Ser Leu Gly
2400 2405 2410

acc atg agg cgc atc aaa caa ata agc gtc acg ctc ccg gcg ctg gtc 9268
Thr Met Arg Arg Ile Lys Gln Ile Ser Val Thr Leu Pro Ala Leu Val
2415 2420 2425

ggc ccc tat cag gac gtc cgt gcg gtt ctc agc tac ggc gga agt atg 9316
Gly Pro Tyr Gln Asp Val Arg Ala Val Leu Ser Tyr Gly Gly Ser Met
2430 2435 2440 2445

gtc atg ccc cgg ggt tgc agc gcg ctg gcg gtc tca cac gga atg aac 9364
Val Met Pro Arg Gly Cys Ser Ala Leu Ala Val Ser His Gly Met Asn
2450 2455 2460

gac agc ggc caa ttc caa ctg gat ttc aat gac ccg cgt tac ctg ccg 9412
Asp Ser Gly Gln Phe Gln Leu Asp Phe Asn Asp Pro Arg Tyr Leu Pro
2465 2470 2475

ttt gaa gga ctt cca gtt gat gac aca ggg acc ctg aca ctg agc ttc 9460
Phe Glu Gly Leu Pro Val Asp Asp Thr Gly Thr Leu Thr Leu Ser Phe
2480 2485 2490

ccg gat gct gac ggc aaa caa cag gcg atg ctc ctc agt ctg agc gac 9508
Pro Asp Ala Asp Gly Lys Gln Gln Ala Met Leu Leu Ser Leu Ser Asp
2495 2500 2505

atc atc ctg cat atc cgt tac acc att atc agc tga tag gtatcaacat 9557
Ile Ile Leu His Ile Arg Tyr Thr Ile Ile Ser * *
2510 2515 2520

agcgcaggcc cccgaacgag ggcctgcgag gagactgagc atg caa aat cat caa 9612
Met Gln Asn His Gln
2525

gac atg gcc att act gcc ccc acg ttg cct tcc ggg ggc ggt gcg gtc 9660
Asp Met Ala Ile Thr Ala Pro Thr Leu Pro Ser Gly Gly Gly Ala Val
2530 2535 2540

acc ggg ctc aag ggt gat atc gcg gcg gca ggg ccg gat ggt gcg gcg 9708
Thr Gly Leu Lys Gly Asp Ile Ala Ala Ala Gly Pro Asp Gly Ala Ala
2545 2550 2555

acc ctg agt att ccc ttg ccg gtt agc ccc ggt cgg ggt tac gcc ccc 9756
Thr Leu Ser Ile Pro Leu Pro Val Ser Pro Gly Arg Gly Tyr Ala Pro
2560 2565 2570

act ggg gca ctt aat tat cac agc cgg tcg ggg aac ggc ccc ttt ggc 9804
Thr Gly Ala Leu Asn Tyr His Ser Arg Ser Gly Asn Gly Pro Phe Gly
2575 2580 2585

att ggc tgg ggt atc ggc ggt gct gct gtc cag cgt cgt acg cgc aac 9852
Ile Gly Trp Gly Ile Gly Gly Ala Ala Val Gln Arg Arg Thr Arg Asn
2590 2595 2600 2605

gga gca cct acc tac gat gat act gat gaa ttc acc ggt ccg gac ggt 9900
Gly Ala Pro Thr Tyr Asp Asp Thr Asp Glu Phe Thr Gly Pro Asp Gly
2610 2615 2620

gag gtg ctg gtg ccg gca ctc acg gct gct ggc acc caa gaa gca cgg 9948
Glu Val Leu Val Pro Ala Leu Thr Ala Ala Gly Thr Gln Glu Ala Arg
2625 2630 2635

cag gcc acc tca cta ctg ggg ata aac cca ggc gga agc ttc aac gtt 9996
Gln Ala Thr Ser Leu Leu Gly Ile Asn Pro Gly Gly Ser Phe Asn Val
2640 2645 2650

cag gtt tac cgt tca cgt acg gag ggt agt ctc agc cgc ctt gag cgt 10044
Gln Val Tyr Arg Ser Arg Thr Glu Gly Ser Leu Ser Arg Leu Glu Arg
2655 2660 2665

tgg ctg ccc gcc gac gag aca gaa acg gaa ttt tgg gtg tta tat acc 10092
Trp Leu Pro Ala Asp Glu Thr Glu Thr Glu Phe Trp Val Leu Tyr Thr
2670 2675 2680 2685

cct gac gga cag gtg gct ctg ctg ggc cga aat gcg cag gct cgc atc 10140
Pro Asp Gly Gln Val Ala Leu Leu Gly Arg Asn Ala Gln Ala Arg Ile
2690 2695 2700

agc aac ccc aca gcc cca aca cag acg gcg gtt tgg ctg atg gag tcc 10188
Ser Asn Pro Thr Ala Pro Thr Gln Thr Ala Val Trp Leu Met Glu Ser
2705 2710 2715

tcg gta tca ctt acc ggc gaa cag atg tat tac caa tac cgt gcg gaa 10236
Ser Val Ser Leu Thr Gly Glu Gln Met Tyr Tyr Gln Tyr Arg Ala Glu
2720 2725 2730

gat gat gac ggt tgt gac gag gcg gag cgc gac gcg cac ccg cag gcc 10284
Asp Asp Asp Gly Cys Asp Glu Ala Glu Arg Asp Ala His Pro Gln Ala
2735 2740 2745

ggc gcc caa cgt tat ccg gtg gcg gtc tgg tat ggt aac cgt cag gcg 10332
Gly Ala Gln Arg Tyr Pro Val Ala Val Trp Tyr Gly Asn Arg Gln Ala
2750 2755 2760 2765

gct cgg acg cta ccg gcg ctg gtg tcg aca cca tca atg gat agc tgg 10380
Ala Arg Thr Leu Pro Ala Leu Val Ser Thr Pro Ser Met Asp Ser Trp
2770 2775 2780

ctg ttt atc ctg gtg ttt gat tat ggt gag cgt agc tcg gtg ctg tct 10428
Leu Phe Ile Leu Val Phe Asp Tyr Gly Glu Arg Ser Ser Val Leu Ser
2785 2790 2795

gaa gcg ccg gcc tgg caa aca cca gga agt ggg gag tgg ctg tgt cgt 10476
Glu Ala Pro Ala Trp Gln Thr Pro Gly Ser Gly Glu Trp Leu Cys Arg
2800 2805 2810

cag gat tgt ttt tcc ggg tat gag ttt ggt ttt aac ctg cgg act cgc 10524
Gln Asp Cys Phe Ser Gly Tyr Glu Phe Gly Phe Asn Leu Arg Thr Arg
2815 2820 2825

cgc ctg tgc cgt cag gtt ttg atg ttc cat tac cta ggt gtt ctg gcg 10572
Arg Leu Cys Arg Gln Val Leu Met Phe His Tyr Leu Gly Val Leu Ala
2830 2835 2840 2845

ggg agt tcg gga gcg aat gat gcg cca gca ttg att tct cgc ctg ttg 10620
Gly Ser Ser Gly Ala Asn Asp Ala Pro Ala Leu Ile Ser Arg Leu Leu
2850 2855 2860

ctg gac tac agg gaa agt cct tca ctc agt ctg ctc gag aac gtg cac 10668
Leu Asp Tyr Arg Glu Ser Pro Ser Leu Ser Leu Leu Glu Asn Val His
2865 2870 2875

cag gtg gct tat gag tcg gac ggg acg tct tgt gcc ttg ccg gca ctg 10716
Gln Val Ala Tyr Glu Ser Asp Gly Thr Ser Cys Ala Leu Pro Ala Leu
2880 2885 2890

gca ttg ggg tgg caa acc ttt acc ccg ccg aca ttg tcg gca tgg cag 10764
Ala Leu Gly Trp Gln Thr Phe Thr Pro Pro Thr Leu Ser Ala Trp Gln
2895 2900 2905

acg cgt gac gat atg ggc aag ttg agt ttg ctt caa ccc tat cag ctt 10812
Thr Arg Asp Asp Met Gly Lys Leu Ser Leu Leu Gln Pro Tyr Gln Leu
2910 2915 2920 2925

gta gac ctt aac ggc gaa ggt gtg gtg ggt atc ctg tat cag gac agc 10860
Val Asp Leu Asn Gly Glu Gly Val Val Gly Ile Leu Tyr Gln Asp Ser
2930 2935 2940

ggt gcc tgg tgg tac cgt gaa ccg gta cgc cag tcg ggg gat gat ccg 10908
Gly Ala Trp Trp Tyr Arg Glu Pro Val Arg Gln Ser Gly Asp Asp Pro
2945 2950 2955

gat gct gtg acc tgg ggg gcg gct gcg gcc ctg ccg aca atg ccc gct 10956
Asp Ala Val Thr Trp Gly Ala Ala Ala Ala Leu Pro Thr Met Pro Ala
2960 2965 2970

ttg cat aac agc ggc atc ctg gcg gat ctt aat ggg gat ggt cgg ctg 11004
Leu His Asn Ser Gly Ile Leu Ala Asp Leu Asn Gly Asp Gly Arg Leu
2975 2980 2985

gag tgg gtc gtt acc gcc ccc ggt gtg gcg ggg atg tat gat cgc acc 11052
Glu Trp Val Val Thr Ala Pro Gly Val Ala Gly Met Tyr Asp Arg Thr
2990 2995 3000 3005

ccc ggc cgc gac tgg ttg cat ttc acc ccc ctg tca gcc ttg ccc gta 11100
Pro Gly Arg Asp Trp Leu His Phe Thr Pro Leu Ser Ala Leu Pro Val
3010 3015 3020

gaa tat gcg cat cca aaa gca gtg ctc gcc gat atc ctg ggg gct ggg 11148
Glu Tyr Ala His Pro Lys Ala Val Leu Ala Asp Ile Leu Gly Ala Gly
3025 3030 3035

tta acg gac atg gtg ctt atc ggg ccg cgc agt gtt cgc ctc tat tcc 11196
Leu Thr Asp Met Val Leu Ile Gly Pro Arg Ser Val Arg Leu Tyr Ser
3040 3045 3050

ggc aaa aac gat ggt tgg aat aaa ggg gag acc gtg cag caa acg gaa 11244
Gly Lys Asn Asp Gly Trp Asn Lys Gly Glu Thr Val Gln Gln Thr Glu
3055 3060 3065

aga ctc act ctg ccg gtc ccg ggg gtt gac cca cgt acc ctc gtg gcg 11292
Arg Leu Thr Leu Pro Val Pro Gly Val Asp Pro Arg Thr Leu Val Ala
3070 3075 3080 3085

ttc agt gat atg gct ggc agt gga cag cag cat ttg acg gag gtg cgt 11340
Phe Ser Asp Met Ala Gly Ser Gly Gln Gln His Leu Thr Glu Val Arg
3090 3095 3100

gct aat gga gta cgt tac tgg cca aac ctg ggg cac ggt cgt ttc ggt 11388
Ala Asn Gly Val Arg Tyr Trp Pro Asn Leu Gly His Gly Arg Phe Gly
3105 3110 3115

cag ccg gtg aat att ccc ggt ttt agc cag tca gtg act acg ttt aac 11436
Gln Pro Val Asn Ile Pro Gly Phe Ser Gln Ser Val Thr Thr Phe Asn
3120 3125 3130

cct gac cag ata ttg ctg gcc gat acc gac ggt tcc ggt acc acg gac 11484
Pro Asp Gln Ile Leu Leu Ala Asp Thr Asp Gly Ser Gly Thr Thr Asp
3135 3140 3145

ctg att tat gcg atg agt gac cgg tta gtc att tat ttc aac cag agt 11532
Leu Ile Tyr Ala Met Ser Asp Arg Leu Val Ile Tyr Phe Asn Gln Ser
3150 3155 3160 3165

ggt aat tat ttc gcc gag ccg cat acg ctg ctc ttg ccg aaa ggt gtg 11580
Gly Asn Tyr Phe Ala Glu Pro His Thr Leu Leu Leu Pro Lys Gly Val
3170 3175 3180

cgc tat gat cgc acc tgc agt ctg caa gtg gcg gat atc cag ggg ctg 11628
Arg Tyr Asp Arg Thr Cys Ser Leu Gln Val Ala Asp Ile Gln Gly Leu
3185 3190 3195

ggg gtg cct agc ctg tta ctg acg gtc ccc cat gtc gcg cct cat cac 11676
Gly Val Pro Ser Leu Leu Leu Thr Val Pro His Val Ala Pro His His
3200 3205 3210

tgg gtg tgc cat tta tcg gca gac aaa ccc tgg ttg ttg aat ggc atg 11724
Trp Val Cys His Leu Ser Ala Asp Lys Pro Trp Leu Leu Asn Gly Met
3215 3220 3225

aac aac aat atg ggg gcc cgg cat gca ctg cac tat cgc agt tcg gtg 11772
Asn Asn Asn Met Gly Ala Arg His Ala Leu His Tyr Arg Ser Ser Val
3230 3235 3240 3245

cag ttc tgg ctg gat gag aaa gcc gag gca ctg gcg gca ggc agt tcc 11820
Gln Phe Trp Leu Asp Glu Lys Ala Glu Ala Leu Ala Ala Gly Ser Ser
3250 3255 3260

cct gcc tgc tac ctg cca ttt aca ttg cat acc ctg tgg cgt tcg gtg 11868
Pro Ala Cys Tyr Leu Pro Phe Thr Leu His Thr Leu Trp Arg Ser Val
3265 3270 3275

gtg cag gat gag atc acc ggt aac cgt ctg gtc agc gac gtg ctt tat 11916
Val Gln Asp Glu Ile Thr Gly Asn Arg Leu Val Ser Asp Val Leu Tyr
3280 3285 3290

cgc cac ggc gtc tgg gac ggg cag gaa cgc gag ttt cgg ggg ttt ggt 11964
Arg His Gly Val Trp Asp Gly Gln Glu Arg Glu Phe Arg Gly Phe Gly
3295 3300 3305

ttt gtt gag atc agg gat acc gat acc ttg gca agc cag ggt acg gcg 12012
Phe Val Glu Ile Arg Asp Thr Asp Thr Leu Ala Ser Gln Gly Thr Ala
3310 3315 3320 3325

acg gaa ctg agt atg cct tct gtg agc cgg aac tgg tat gcc acc ggg 12060
Thr Glu Leu Ser Met Pro Ser Val Ser Arg Asn Trp Tyr Ala Thr Gly
3330 3335 3340

gta ccg gca gta gac gag cgt ctg ccg gag acg tat tgg caa aac gat 12108
Val Pro Ala Val Asp Glu Arg Leu Pro Glu Thr Tyr Trp Gln Asn Asp
3345 3350 3355

gcc gcc gct ttt gcc gat ttc gcg acc cgt ttc act gtc ggt tca gga 12156
Ala Ala Ala Phe Ala Asp Phe Ala Thr Arg Phe Thr Val Gly Ser Gly
3360 3365 3370

gag gat gag cag aca tat act ccg gac gac agc aag aca ttc tgg ttg 12204
Glu Asp Glu Gln Thr Tyr Thr Pro Asp Asp Ser Lys Thr Phe Trp Leu
3375 3380 3385

cag cga gcc ctg aaa ggc atc ctg ctg cgc agt gag tta tac ggt gcc 12252
Gln Arg Ala Leu Lys Gly Ile Leu Leu Arg Ser Glu Leu Tyr Gly Ala
3390 3395 3400 3405

gat ggc agc agc cag gcc gat atc cct tac agc gtc act gag tct cgc 12300
Asp Gly Ser Ser Gln Ala Asp Ile Pro Tyr Ser Val Thr Glu Ser Arg
3410 3415 3420

ccg cag gta cgg cta gtt gaa gcg aat gga gac tac ccg gtg gtg tgg 12348
Pro Gln Val Arg Leu Val Glu Ala Asn Gly Asp Tyr Pro Val Val Trp
3425 3430 3435

ccg atg ggc gcg gaa agc cgt acg tca gtt tat gaa cgg tac cac aat 12396
Pro Met Gly Ala Glu Ser Arg Thr Ser Val Tyr Glu Arg Tyr His Asn
3440 3445 3450

gat cct caa tgc caa cag cag gcg gta ctc ctc agt gat gaa tac ggt 12444
Asp Pro Gln Cys Gln Gln Gln Ala Val Leu Leu Ser Asp Glu Tyr Gly
3455 3460 3465

ttc cca ctg cgt cag gtc agt gtc aat tat cca cga cgc cct ccg tcg 12492
Phe Pro Leu Arg Gln Val Ser Val Asn Tyr Pro Arg Arg Pro Pro Ser
3470 3475 3480 3485

gcg gac aat cca tat ccg gcg tcc tta ccg gcg acg ctg ttc gcc aac 12540
Ala Asp Asn Pro Tyr Pro Ala Ser Leu Pro Ala Thr Leu Phe Ala Asn
3490 3495 3500

agt tat gac gag cag cag cag ata tta cgc ctg ggg ttg caa cag agc 12588
Ser Tyr Asp Glu Gln Gln Gln Ile Leu Arg Leu Gly Leu Gln Gln Ser
3505 3510 3515

agt gca cat cac ctt gtt tca ctg tct gag ggg cat tgg ttg ttg ggg 12636
Ser Ala His His Leu Val Ser Leu Ser Glu Gly His Trp Leu Leu Gly
3520 3525 3530

ttg gcg gag gcg tcg cgg gac gat gta ttc acg tac tct gcg gac aac 12684
Leu Ala Glu Ala Ser Arg Asp Asp Val Phe Thr Tyr Ser Ala Asp Asn
3535 3540 3545

gtg ccg gaa ggg ggt ctg acg ctg gaa cac ctg ttg gcg ccc gaa agc 12732
Val Pro Glu Gly Gly Leu Thr Leu Glu His Leu Leu Ala Pro Glu Ser
3550 3555 3560 3565

ctg gtc tcg gat agt cag gtc ggt acg ctg gcg ggt cag cag caa gtc 12780
Leu Val Ser Asp Ser Gln Val Gly Thr Leu Ala Gly Gln Gln Gln Val
3570 3575 3580

tgg tat ctg gat tca caa gac gtt gcc acc gtc gct gct ccg cca ctc 12828
Trp Tyr Leu Asp Ser Gln Asp Val Ala Thr Val Ala Ala Pro Pro Leu
3585 3590 3595

ccc ccc aag gta gct ttt atc gaa acg gcc gtg ctg gat gag ggt atg 12876
Pro Pro Lys Val Ala Phe Ile Glu Thr Ala Val Leu Asp Glu Gly Met
3600 3605 3610

gtc agt tca ctg gct gcc tac att gtg gat gaa cat ctc gag caa gcc 12924
Val Ser Ser Leu Ala Ala Tyr Ile Val Asp Glu His Leu Glu Gln Ala
3615 3620 3625

ggt tac cgg caa tcc gga tac ctt ttc cct cga ggc agg gaa gca gaa 12972
Gly Tyr Arg Gln Ser Gly Tyr Leu Phe Pro Arg Gly Arg Glu Ala Glu
3630 3635 3640 3645

cag gca ttg tgg acc cag tgt cag gga tat gtt acc tat gcc ggc gca 13020
Gln Ala Leu Trp Thr Gln Cys Gln Gly Tyr Val Thr Tyr Ala Gly Ala
3650 3655 3660

gag cat ttc tgg cta ccg cta tcc ttt cgg gac agt atg ttg acc ggc 13068
Glu His Phe Trp Leu Pro Leu Ser Phe Arg Asp Ser Met Leu Thr Gly
3665 3670 3675

cca gtt acc gtg acg cgt gac gcg tac gac tgc gtc atc acg cag tgg 13116
Pro Val Thr Val Thr Arg Asp Ala Tyr Asp Cys Val Ile Thr Gln Trp
3680 3685 3690

cag gat gcc gca ggg att gtc acc aca gcc gac tat gac tgg cgc ttc 13164
Gln Asp Ala Ala Gly Ile Val Thr Thr Ala Asp Tyr Asp Trp Arg Phe
3695 3700 3705

ctg acg ccc gtc cgg gtg acg gac ccc aat gat aat ctg cag tcc gtc 13212
Leu Thr Pro Val Arg Val Thr Asp Pro Asn Asp Asn Leu Gln Ser Val
3710 3715 3720 3725

act ctg gat gct ctg ggc cgg gtg acc acc ctg cga ttc tgg ggc acg 13260
Thr Leu Asp Ala Leu Gly Arg Val Thr Thr Leu Arg Phe Trp Gly Thr
3730 3735 3740

gag aat ggt att gcc acc ggt tac agt gat gcc acg ttg tcc gtt ccg 13308
Glu Asn Gly Ile Ala Thr Gly Tyr Ser Asp Ala Thr Leu Ser Val Pro
3745 3750 3755

gac ggc gca gca gcc gct ctg gcg ttg acg gcg ccc cta cca gta gca 13356
Asp Gly Ala Ala Ala Ala Leu Ala Leu Thr Ala Pro Leu Pro Val Ala
3760 3765 3770

cag tgt ctg gtg tat gtc acg gac agt tgg gga gat gac gac aat gag 13404
Gln Cys Leu Val Tyr Val Thr Asp Ser Trp Gly Asp Asp Asp Asn Glu
3775 3780 3785

aaa atg ccc ccg cac gtg gtc gtg ctg gct acc gat cgc tat gac agt 13452
Lys Met Pro Pro His Val Val Val Leu Ala Thr Asp Arg Tyr Asp Ser
3790 3795 3800 3805

gat acc gga cag cag gtc cgc caa cag gtg aca ttc agt gac ggt ttt 13500
Asp Thr Gly Gln Gln Val Arg Gln Gln Val Thr Phe Ser Asp Gly Phe
3810 3815 3820

ggg cgt gag ttg caa tcg gca acc cgg cag gcc gag ggc aac gcc tgg 13548
Gly Arg Glu Leu Gln Ser Ala Thr Arg Gln Ala Glu Gly Asn Ala Trp
3825 3830 3835

caa cga gga cgc gac ggc aaa ctg gtg acg gcc agt gac gga ttg ccg 13596
Gln Arg Gly Arg Asp Gly Lys Leu Val Thr Ala Ser Asp Gly Leu Pro
3840 3845 3850

gtc act gta gca acg aat ttc cgc tgg gcg gtc acc ggg agg gcg gag 13644
Val Thr Val Ala Thr Asn Phe Arg Trp Ala Val Thr Gly Arg Ala Glu
3855 3860 3865

tat gac aat aaa ggt ctg cct gtt cgg gtt tat cag ccg tat ttt ctg 13692
Tyr Asp Asn Lys Gly Leu Pro Val Arg Val Tyr Gln Pro Tyr Phe Leu
3870 3875 3880 3885

gac agt tgg caa tat gtc agt gat gac agt gcc cgc cag gac ctg tat 13740
Asp Ser Trp Gln Tyr Val Ser Asp Asp Ser Ala Arg Gln Asp Leu Tyr
3890 3895 3900

gcc gac acg cac ttt tac gat ccg acg gca cgg gaa tgg cag gtt att 13788
Ala Asp Thr His Phe Tyr Asp Pro Thr Ala Arg Glu Trp Gln Val Ile
3905 3910 3915

acg gca aaa ggt gaa cgg cga cag gtg ctg tat acc ccg tgg ttt gtg 13836
Thr Ala Lys Gly Glu Arg Arg Gln Val Leu Tyr Thr Pro Trp Phe Val
3920 3925 3930

gtc agt gaa gac gag aat gat acc gtt ggg cta aac gac gca tcc tga 13884
Val Ser Glu Asp Glu Asn Asp Thr Val Gly Leu Asn Asp Ala Ser *
3935 3940 3945

ctgggaagga gggggggacg gtg atg agt ccg tcg ccc ctg aca ggc gct gcc 13937
Met Ser Pro Ser Pro Leu Thr Gly Ala Ala
3950 3955

ctg atg gag aca aag atg aaa ata cac tat cag gtt gcg gcg gtt gtg 13985
Leu Met Glu Thr Lys Met Lys Ile His Tyr Gln Val Ala Ala Val Val
3960 3965 3970

ctg aca ggt gtt atg gtt tgg ggg ctt tcc cat tgg cgt tac acc gtc 14033
Leu Thr Gly Val Met Val Trp Gly Leu Ser His Trp Arg Tyr Thr Val
3975 3980 3985 3990

ggt tac cac gcg gca gat act caa tgg caa caa cgc cag gcc gaa cag 14081
Gly Tyr His Ala Ala Asp Thr Gln Trp Gln Gln Arg Gln Ala Glu Gln
3995 4000 4005

gaa agg gcc gat gcg ttg gcc ctc ctg gca gca gaa acc cgg gaa aga 14129
Glu Arg Ala Asp Ala Leu Ala Leu Leu Ala Ala Glu Thr Arg Glu Arg
4010 4015 4020

aag tgg gag cag caa cga cag act gac atg aac aag gtg gct ata cat 14177
Lys Trp Glu Gln Gln Arg Gln Thr Asp Met Asn Lys Val Ala Ile His
4025 4030 4035

gct gaa gaa gaa ctg gct gct gcg cgt gac gct gcc gct gat gct cag 14225
Ala Glu Glu Glu Leu Ala Ala Ala Arg Asp Ala Ala Ala Asp Ala Gln
4040 4045 4050

cgc act ggt cag cgc ctg cag cac acc gtt acc acc ctc cag cgg caa 14273
Arg Thr Gly Gln Arg Leu Gln His Thr Val Thr Thr Leu Gln Arg Gln
4055 4060 4065 4070

ctt gcc agt cgt gaa acc cgc cgc ctt tcc gca gct acc gct atc ggt 14321
Leu Ala Ser Arg Glu Thr Arg Arg Leu Ser Ala Ala Thr Ala Ile Gly
4075 4080 4085

aca gac gac ctc gga ggc caa ccc ggc gtt ttg ttt gcc gaa ctg ttc 14369
Thr Asp Asp Leu Gly Gly Gln Pro Gly Val Leu Phe Ala Glu Leu Phe
4090 4095 4100

cgc cgc gct gac cag aga gcg gga gag ctg gca gcg tat gct gac agg 14417
Arg Arg Ala Asp Gln Arg Ala Gly Glu Leu Ala Ala Tyr Ala Asp Arg
4105 4110 4115

acc aga gtg aaa tgg cag gcc tgc ggg cgc gcc tat cag gcg gct acg 14465
Thr Arg Val Lys Trp Gln Ala Cys Gly Arg Ala Tyr Gln Ala Ala Thr
4120 4125 4130

cac gaa gca gaa aaa taa ggcgatttag ccgttaagga aaagtgacgg 14513
His Glu Ala Glu Lys *
4135

tgttttcgcg attaatatta acaggagatc ac atg agc aca tcc ttg ttc agt 14566
Met Ser Thr Ser Leu Phe Ser
4140 4145

agc acc ccg tcg gtc gcg gtg ctc gac aac cgc ggc ctg ttg gtg cgg 14614
Ser Thr Pro Ser Val Ala Val Leu Asp Asn Arg Gly Leu Leu Val Arg
4150 4155 4160

gag ctg cag tac tac cgc cat ccg gat aca ccg gag gag acg gac gag 14662
Glu Leu Gln Tyr Tyr Arg His Pro Asp Thr Pro Glu Glu Thr Asp Glu
4165 4170 4175

cgt atc acc tgc cat cag cac gat gag cgc ggc agc ttg tca caa agc 14710
Arg Ile Thr Cys His Gln His Asp Glu Arg Gly Ser Leu Ser Gln Ser
4180 4185 4190

gcc gac ccg cgg tta cac gcg gcc ggt ctg aca aat ttc acg tac ctg 14758
Ala Asp Pro Arg Leu His Ala Ala Gly Leu Thr Asn Phe Thr Tyr Leu
4195 4200 4205 4210

aat agc ctg acc ggg aca gta ctg cag agc gtc agc gcc gat gcc ggt 14806
Asn Ser Leu Thr Gly Thr Val Leu Gln Ser Val Ser Ala Asp Ala Gly
4215 4220 4225

acg tcg ctg gaa ctg agc gat gcc gcc ggg cgg gcg ttt ctg gcc gtc 14854
Thr Ser Leu Glu Leu Ser Asp Ala Ala Gly Arg Ala Phe Leu Ala Val
4230 4235 4240

acc ggg gct ggg acg gaa gac gcg gtc acc cgc acc tgg caa tat gaa 14902
Thr Gly Ala Gly Thr Glu Asp Ala Val Thr Arg Thr Trp Gln Tyr Glu
4245 4250 4255

gac gat acc ctg ccg ggc cgc ccg ctg agc atc acc gag cag gtt acc 14950
Asp Asp Thr Leu Pro Gly Arg Pro Leu Ser Ile Thr Glu Gln Val Thr
4260 4265 4270

ggt gaa gcc gcc caa att acg gaa cgc ttc gtg tac gct ggc aat acg 14998
Gly Glu Ala Ala Gln Ile Thr Glu Arg Phe Val Tyr Ala Gly Asn Thr
4275 4280 4285 4290

gat gcc gag aag att ctc aat ctg gct ggc cag tgt gtc agt cat tac 15046
Asp Ala Glu Lys Ile Leu Asn Leu Ala Gly Gln Cys Val Ser His Tyr
4295 4300 4305

gat acc gcc gga ctg gtg cag acg gac agc atc gcc ctg agc ggc gtg 15094
Asp Thr Ala Gly Leu Val Gln Thr Asp Ser Ile Ala Leu Ser Gly Val
4310 4315 4320

ccg ctc gcc gtc acg cgg cag ttg ctg ccc gac gcg gcg ggg gcc aac 15142
Pro Leu Ala Val Thr Arg Gln Leu Leu Pro Asp Ala Ala Gly Ala Asn
4325 4330 4335

tgg atg ggt gag gat gcc tcg gcc tgg aat gac ctg ctg gat ggg gag 15190
Trp Met Gly Glu Asp Ala Ser Ala Trp Asn Asp Leu Leu Asp Gly Glu
4340 4345 4350

acg ttc ttc acc cag acc cac gct gat gcg acc ggc gcc gtc ctg agc 15238
Thr Phe Phe Thr Gln Thr His Ala Asp Ala Thr Gly Ala Val Leu Ser
4355 4360 4365 4370

atc acc gat gca aaa ggt aat ctg cag cgt gtg gca tat gat gtg gct 15286
Ile Thr Asp Ala Lys Gly Asn Leu Gln Arg Val Ala Tyr Asp Val Ala
4375 4380 4385

ggg ctg cta tcg ggc agt tgg ttg acg ctg aag gac ggc acg gag cag 15334
Gly Leu Leu Ser Gly Ser Trp Leu Thr Leu Lys Asp Gly Thr Glu Gln
4390 4395 4400

gtc atc gtg gcc tcc ctg acg tac tcg gcc gcc ggg aaa aag ttg cgt 15382
Val Ile Val Ala Ser Leu Thr Tyr Ser Ala Ala Gly Lys Lys Leu Arg
4405 4410 4415

gaa gaa cac ggc aac ggc gtg gta acc tcg tat att tac gag ccg gaa 15430
Glu Glu His Gly Asn Gly Val Val Thr Ser Tyr Ile Tyr Glu Pro Glu
4420 4425 4430

aca cag cgc ctg acg ggg att aaa acg gaa cgt ccg tct ggg cac gtt 15478
Thr Gln Arg Leu Thr Gly Ile Lys Thr Glu Arg Pro Ser Gly His Val
4435 4440 4445 4450

gcc gga gca aaa gtg ctg cag gac ctg cgc tat acg tat gac ccg gta 15526
Ala Gly Ala Lys Val Leu Gln Asp Leu Arg Tyr Thr Tyr Asp Pro Val
4455 4460 4465

ggc aac gta ctc agc gtc aat aac gat gcg gaa gag acc cgc ttc tgg 15574
Gly Asn Val Leu Ser Val Asn Asn Asp Ala Glu Glu Thr Arg Phe Trp
4470 4475 4480

cgt aac cag aaa gtg gta ccg gag aat acg tac atc tac gac agc ctg 15622
Arg Asn Gln Lys Val Val Pro Glu Asn Thr Tyr Ile Tyr Asp Ser Leu
4485 4490 4495

tac cag ctg gtc agc gcc aca ggg cgt gag atg gcc aat gcc ggc cag 15670
Tyr Gln Leu Val Ser Ala Thr Gly Arg Glu Met Ala Asn Ala Gly Gln
4500 4505 4510

cag ggc aac gac tta cca tcc gct aca gcc ccc ctt cct aca gac agc 15718
Gln Gly Asn Asp Leu Pro Ser Ala Thr Ala Pro Leu Pro Thr Asp Ser
4515 4520 4525 4530

tct gcc tac acc aat tac acg cgc acc tac cgt tat gac cgt ggt ggc 15766
Ser Ala Tyr Thr Asn Tyr Thr Arg Thr Tyr Arg Tyr Asp Arg Gly Gly
4535 4540 4545

aac ctg acg cag atg cgc cac agt gcc cct gcc acg aac aat aat tat 15814
Asn Leu Thr Gln Met Arg His Ser Ala Pro Ala Thr Asn Asn Asn Tyr
4550 4555 4560

acg aca gac atc acg gtt agt gac cgc agc aat agg gcg gta ctg agc 15862
Thr Thr Asp Ile Thr Val Ser Asp Arg Ser Asn Arg Ala Val Leu Ser
4565 4570 4575

acg ttg gcg gaa gtg ccg tca gat gtt gat atg ctg ttc agt gca gga 15910
Thr Leu Ala Glu Val Pro Ser Asp Val Asp Met Leu Phe Ser Ala Gly
4580 4585 4590

ggt cac cag aag cac ctg cag ccg ggg caa gca ctg gtg tgg acg cca 15958
Gly His Gln Lys His Leu Gln Pro Gly Gln Ala Leu Val Trp Thr Pro
4595 4600 4605 4610

cgt gga gaa ctg caa aag gtg aca ccg gtg gtg cgt gat ggg ggg gcg 16006
Arg Gly Glu Leu Gln Lys Val Thr Pro Val Val Arg Asp Gly Gly Ala
4615 4620 4625

gac gac agc gaa agc tat cgg tat gat gcg ggc agt cag cgt att atc 16054
Asp Asp Ser Glu Ser Tyr Arg Tyr Asp Ala Gly Ser Gln Arg Ile Ile
4630 4635 4640

aaa acc ggc acg cgg caa act ggc aac aac gtt cag aca cag cgg gta 16102
Lys Thr Gly Thr Arg Gln Thr Gly Asn Asn Val Gln Thr Gln Arg Val
4645 4650 4655

gtg tac ctg ccg ggg ctg gag tta cgt atc atg gca aat ggc gtg acg 16150
Val Tyr Leu Pro Gly Leu Glu Leu Arg Ile Met Ala Asn Gly Val Thr
4660 4665 4670

gaa aaa gaa agc ctg cag gtt att acg gtg ggc gag gct ggg cgg gca 16198
Glu Lys Glu Ser Leu Gln Val Ile Thr Val Gly Glu Ala Gly Arg Ala
4675 4680 4685 4690 

caa gtg cgc gta ttg cac tgg gag atc ggc aag ccg gat gac ctc gat 16246
Gln Val Arg Val Leu His Trp Glu Ile Gly Lys Pro Asp Asp Leu Asp
4695 4700 4705

gag gac tcg gtg cgt tac agt tac gat aac ctg gtg ggc agc agc cag 16294
Glu Asp Ser Val Arg Tyr Ser Tyr Asp Asn Leu Val Gly Ser Ser Gln
4710 4715 4720

ctg gag ctg gac aga gag ggt tac ctt atc agt gag gag gag ttc tac 16342
Leu Glu Leu Asp Arg Glu Gly Tyr Leu Ile Ser Glu Glu Glu Phe Tyr
4725 4730 4735

ccg tat ggc gga acg gct gtt ctg acg gcg cga agt gag gtt gag gct 16390
Pro Tyr Gly Gly Thr Ala Val Leu Thr Ala Arg Ser Glu Val Glu Ala
4740 4745 4750

gac tac aaa act atc cga tac tca ggc aag gag cgt gac gcg acg ggg 16438
Asp Tyr Lys Thr Ile Arg Tyr Ser Gly Lys Glu Arg Asp Ala Thr Gly
4755 4760 4765 4770

ctg gat tat tac ggt tat cgg tat tac cag cca tgg gca ggg cgc tgg 16486
Leu Asp Tyr Tyr Gly Tyr Arg Tyr Tyr Gln Pro Trp Ala Gly Arg Trp
4775 4780 4785

ctc tcc acg gac ccg gca ggc acg gtg gac ggg ctg aac ctg ttc cgc 16534
Leu Ser Thr Asp Pro Ala Gly Thr Val Asp Gly Leu Asn Leu Phe Arg
4790 4795 4800

atg gtg cgg aat aat ccc gtc acg ctg ttt gac agc aac ggg cgg atc 16582
Met Val Arg Asn Asn Pro Val Thr Leu Phe Asp Ser Asn Gly Arg Ile
4805 4810 4815



agt act ggt cag gag gcc aga cga tta gtg ggg gaa gca ttt gtt cat 16630
Ser Thr Gly Gln Glu Ala Arg Arg Leu Val Gly Glu Ala Phe Val His
4820 4825 4830

ccg tta cac atg cct gtt ttt gaa aga att tct gta gag aga aag att 16678
Pro Leu His Met Pro Val Phe Glu Arg Ile Ser Val Glu Arg Lys Ile
4835 4840 4845 4850

tca atg agc gta agg gaa gct ggc att tat act att tca gcg ctg ggt 16726
Ser Met Ser Val Arg Glu Ala Gly Ile Tyr Thr Ile Ser Ala Leu Gly
4855 4860 4865

gaa ggt gca gca gca aaa ggc cat aat att cta gag aaa acc att aaa 16774
Glu Gly Ala Ala Ala Lys Gly His Asn Ile Leu Glu Lys Thr Ile Lys
4870 4875 4880

ccc ggt tcc ctg aag gct atc tat ggt gat aaa gct gag tca att ctt 16822
Pro Gly Ser Leu Lys Ala Ile Tyr Gly Asp Lys Ala Glu Ser Ile Leu
4885 4890 4895

gga ctg gca aaa cgt agc ggt ctc gtt ggc cga gta gga cag tgg gat 16870
Gly Leu Ala Lys Arg Ser Gly Leu Val Gly Arg Val Gly Gln Trp Asp
4900 4905 4910

gca tca ggt gta cgt gga att tat gcg cac aac aga ccg ggt ggt gag 16918
Ala Ser Gly Val Arg Gly Ile Tyr Ala His Asn Arg Pro Gly Gly Glu
4915 4920 4925 4930

gat ttg gtt tat cct gtc agc ctg cag aat act tct gcc aat gaa att 16966
Asp Leu Val Tyr Pro Val Ser Leu Gln Asn Thr Ser Ala Asn Glu Ile
4935 4940 4945



gtt aat gca tgg ata aaa ttt aaa atc atc acg ccc tac acc ggg gat 17014
Val Asn Ala Trp Ile Lys Phe Lys Ile Ile Thr Pro Tyr Thr Gly Asp
4950 4955 4960

tat gac atg cac gat att att aaa ttc tct gat ggg aaa ggg cat gtg 17062
Tyr Asp Met His Asp Ile Ile Lys Phe Ser Asp Gly Lys Gly His Val
4965 4970 4975

cct aca gcg gaa agt agt gag gaa aga gga gta aaa gat cta att aat 17110
Pro Thr Ala Glu Ser Ser Glu Glu Arg Gly Val Lys Asp Leu Ile Asn
4980 4985 4990

aaa ggt gtt gcg gag gtc gat cct tcc aga ccc ttt gag tat aca gcg 17158
Lys Gly Val Ala Glu Val Asp Pro Ser Arg Pro Phe Glu Tyr Thr Ala
4995 5000 5005 5010

atg aat gtt att cgc cat gga cca cag gtg aac ttt gtt ccc tat atg 17206
Met Asn Val Ile Arg His Gly Pro Gln Val Asn Phe Val Pro Tyr Met
5015 5020 5025

tgg gaa cat gag cac gat aaa gtc gtt aat gat aat ggt tat ctg ggg 17254
Trp Glu His Glu His Asp Lys Val Val Asn Asp Asn Gly Tyr Leu Gly
5030 5035 5040

gtg gta gct agc ccg ggg ccg ttc ccg gta gcg atg gta cat cag ggg 17302
Val Val Ala Ser Pro Gly Pro Phe Pro Val Ala Met Val His Gln Gly
5045 5050 5055

gaa tgg act gtt ttt gac aac agt gaa gaa ctg ttt aat ttc tat aaa 17350
Glu Trp Thr Val Phe Asp Asn Ser Glu Glu Leu Phe Asn Phe Tyr Lys
5060 5065 5070

tct aca aat aca cct ctt cct gaa cac tgg tcc caa gat ttt atg gac 17398
Ser Thr Asn Thr Pro Leu Pro Glu His Trp Ser Gln Asp Phe Met Asp
5075 5080 5085 5090

aga ggg aaa gga ata gtc gca act cct cgg cat gct gaa ctt ctt gat 17446
Arg Gly Lys Gly Ile Val Ala Thr Pro Arg His Ala Glu Leu Leu Asp
5095 5100 5105

aaa cga cga gtc atg tac taa tcgtaacgat ttcctgcctt acccaaagta 17497
Lys Arg Arg Val Met Tyr *
5110



tacagcccgg tgagacattt tctctgtctc atttgggttg tttttgtctc atctgcatgt 17557
tatgtcttcc ctcatctaaa gtctaacgag acatttttag caaaatggca ctttacggtt 17617
atgttcgcgt ttcaaccgac ggtccggatt ttactctgta aatacagaca cttcgcgcag 17677
cctgctgcga aattatccgt gcgaaaaaag ccagcggcag cagccgggat ggacgaaatg 17737
aactgcagct tctgctggct tttttgcggc caggcaacat gctgatggtt acgtgagttg 17797
atcggctgcc accaaaaagt ccggagcgtg cggcccagat cgccgcaata atactgctgt 17857
atggtatttc catcaccact gtatatcgca cactctgggc cttccagaaa ccccataccg 17917
cacaccggtg tgatcgctgg aagccccggg cattaccgcc gtctgtactc gaacactatt 17977
gtggacttga tggttaggag attgaatcga ccatttttga gatccctaac catagatcgt 18037
agagttgcac actcccagat ggcgtggctt agcgagcgat tatgcttaaa aattcatgtt 18097
ttgctgtgtt tttaatccaa aacctgcttt tcaggcgcac ttatccagct acggggtctg 18157
aagccatcgt ttttttgccg tacgatgtag cctgtcagag agcatttttg tggcgtgctc 18217
gcccgctacg gtaccggcgg caaaacgcag ccggcctttg cagaggatgc actggtacgg 18277
atcggtgccc aggaagcctt tcatcagcac cgcgaacccg ggccgtttcg gtttctcccg 18337
taccgtcatc tccagcgcgt cgtaaacctt cggcagcagc gtgcccgttt gcggttggcc 18397
agaaaaccat agtaacgcac cattttaaaa tgccgtgcag ggatatggct gacgtaacgc 18457
tgcagcatct cctcctggct gattttctgg cgtttgtgct gctgcgtacg gtgatcgtaa 18517
tactgatgca ccacggcccc gccgcggtag tggcgtagct gagaagccgc caccggcggg 18577
cgcttcaggt accgggccag gtatttcacg ctgcgccagg cgccgcgggt ctttttggca 18637
aaattcactt tccaggggcg gcggtattgc gcatgcaggg tcttcgttgc ggatatggcc 18697
gagacccggc agggcgccag gattgatgcg cagcaggtga acgacggcat tgcgccagat 18757
ggcttccacc tctttctttt taaagaacag ctgccgccag acgtggtgtt tgacgtcaag 18817
accgccgcgg gtaacggaga cgtggatatg cggatgttga ttgagctgcc ggccttaggt 18877
gtggagcgcg caaaaaatgc cggcctcgat gccctgccgg cgtgcccagc ggagcatggc



<210> 2 
<211> 144 
<212> PRT 

<213> Serratia entomophila 

<220>

<223> ORF1 amino acid sequence encoding an insecticidal protein when
linked with at least SEQ ID NO: 1



<400> 2 


























Glu Gly Met Lys Ile Ser Ser Arg Gly Ile Ala Leu Ile Lys Glu Phe
1 5 10 15

Leu Arg Leu His Ala Tyr Arg Cys Ala Ala Asp Val Trp Thr Val Gly
20 25 30

Tyr Gly His Thr Ala Gly Val Thr Lys Gly Asp Ile Ile Thr Val Asp
35 40 45

Glu Ala Gln Thr Met Leu Thr Asn Asp Ile Thr Val Phe Glu Arg Ala
50 55 60

Ala Val Ser Gln Ala Val Ala Val Pro Leu Asn Gln Ser Gln Tyr Asp
65 70 75 80

Leu Val Ser Leu Val Phe Asn Ile Gly Gln Gly Asn Phe Lys Arg Ser
85 90 95

Thr Leu Leu Lys Lys Leu Asn Lys Gln Asp Tyr Val Gly Ala Gly Asn
100 105 110

Glu Phe Leu Arg Trp Thr Arg Ala Asn Gly Lys Val Leu Pro Gly Leu
115 120 125

Ile Arg Arg Arg Glu Ala Glu Arg Val Leu Phe Glu Lys Leu Gly Ala
130 135 140











<210> 3 
<211> 191 
<212> PRT 

<213> Serratia entomophila 
<220> 

<223> ORF2 amino acid sequence encoding an insecticidal protein when 
linked with at least SEQ ID NO: 1 



<400> 3 



Met Ser Pro Ser Pro Leu Thr Gly Ala Ala Leu Met Glu Thr Lys Met
1 5 10 15

Lys Ile His Tyr Gln Val Ala Ala Val Val Leu Thr Gly Val Met Val
20 25 30

Trp Gly Leu Ser His Trp Arg Tyr Thr Val Gly Tyr His Ala Ala Asp
35 40 45

Thr Gln Trp Gln Gln Arg Gln Ala Glu Gln Glu Arg Ala Asp Ala Leu
50 55 60

Gln Ala Leu Leu Ala Ala Glu Thr Arg Glu Arg Lys Trp Glu Gln Arg
65 70 75 80

Gln Thr Asp Met Asn Lys Val Ala Ile His Ala Glu Glu Glu Leu Ala
85 90 95

Ala Ala Arg Asp Ala Ala Ala Asp Ala Gln Arg Thr Gly Gln Arg Leu
100 105 110

Thr Gln His Thr Val Thr Thr Leu Gln Arg Gln Leu Ala Ser Arg Glu
115 120 125

Gly Arg Arg Leu Ser Ala Ala Thr Ala Ile Gly Thr Asp Asp Leu Gly
130 135 140

Gln Pro Gly Val Leu Phe Ala Glu Leu Phe Arg Arg Ala Asp Gln Arg
145 150 155 160

Ala Gly Glu Leu Ala Ala Tyr Ala Asp Arg Thr Arg Val Lys Trp Gln
165 170 175

Ala Cys Gly Arg Ala Tyr Gln Ala Ala Thr His Glu Ala Glu Lys
180 185 190

<210> 4

<211> 2376 

<212> PRT 

<213> Serratia entomophila 



<220> 

<223> SepA amino acid sequence encoding an insecticidal protein when
linked with at least SEQ ID NO: 1

<400> 4 

Met Arg Gln Asp Ile Met Tyr Asn Ile Asp Asp Ile Leu Glu Lys Val
1 5 10 15

Asn Ala Pro Arg Ala Arg Leu Ser Glu Glu Asn Asp Thr Ala Val Thr
20 25 30

Leu Thr Asp Leu Phe Ser Arg Ser Phe Pro Glu Val Lys Lys Ile Thr
35 40 45

Gly Asp Ser Leu Ser Trp Gly Glu Val Cys Tyr Leu Tyr Ser Gln Ala
50 55 60

Gln His Glu Gln Lys Glu Asn Arg Leu Thr Glu Ser Arg Ile Leu Ala
65 70 75 80

Arg Ala Asn Pro Leu Leu Val Asn Ala Val Arg Leu Gly Ile Arg Gln
85 90 95

Ala Ala Gly Ser Arg Ser Tyr Asp Asp Trp Phe Gly Ser Arg Ala Asp
100 105 110

Arg Phe Ala Arg Pro Gly Ser Val Ala Ser Met Phe Ser Pro Ala Ala
115 120 125

Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asp Leu His Pro Asp Thr
130 135 140

Ser Leu Phe Arg Leu Asp Ile Arg Arg Pro Asp Leu Ala Ala Leu Ala
145 150 155 160

Leu Ser Gln Asn Asn Met Asp Asp Glu Leu Ser Thr Leu Ser Leu Ser
165 170 175

Asn Glu Leu Leu Tyr Arg Gly Ile Gly Ala Ala Glu Gly Leu Asp Asp
180 185 190

Asp Ser Val Arg Glu Leu Leu Ala Gly Tyr Arg Leu Thr Gly Leu Thr
195 200 205

Pro Tyr His Trp Ala Tyr Glu Ala Ala Arg Gln Ala Ile Leu Val Gln
210 215 220

Asp Pro Thr Leu Met Gly Phe Ser Arg Asn Pro Asp Val Ala Gln Leu
225 230 235 240

Met Asp Pro Ala Ser Met Leu Ala Ile Glu Ala Asp Ile Ser Pro Glu
245 250 255

Leu Tyr Gln Ile Leu Ala Glu Glu Ile Thr Thr Asp Ser Tyr Glu Ala
260 265 270

Leu Trp Ser Lys Asn Phe Gly Asp Met Pro Pro Ser Ser Leu Leu Ser
275 280 285

Tyr Asp Ala Leu Ala Thr Phe Tyr Asp Leu Asp Tyr Asp Glu Leu Thr
290 295 300

Ser Leu Leu Ser Leu Arg Leu Asp Phe Ser Asn Pro Asn Asn Glu Tyr
305 310 315 320

Tyr Ile Asn Ser Gln Leu Ser Val Val Thr Leu Asn Glu Ser Thr Gly
325 330 335

Leu Ile Thr Ile His His Tyr Leu Arg Thr Leu Gly Gly Asp Ser Gln
340 345 350

Gln Ile Asn Pro Glu Leu Ile Pro Tyr Gly Asp Gly Thr Tyr Leu Tyr
355 360 365

Asn Phe Ser Val Val Ser Thr Ile Ser Glu Asp Ser Phe Lys Leu Gly
370 375 380

Ser Leu Gly Ser Asn Ser Ser Asn Leu Tyr Ser Gly Asp Tyr Gln Leu
385 390 395 400

Gln Lys Gly Val Arg Tyr Ser Ile Pro Val Glu Ile Asp Glu Gly Lys
405 410 415

Leu Asn Asp Gly Ile Thr Ile Gly Leu Ser Arg Lys Gly Gly Gly Tyr
420 425 430

Tyr Ser Thr Val Asn Phe Thr Leu Ile Glu Tyr Asp Pro Ala Ile Phe
435 440 445

Ile Leu Lys Leu Asn Lys Val Ile Arg Leu Tyr Lys Ala Thr Gly Met
450 455 460

Thr Thr Ala Glu Ile Tyr Gln Ile Thr Asn Ile Leu Asn Asn Gly Leu
465 470 475 480

Thr Ile Asp His Ala Val Leu Ser Lys Ile Phe Leu Val Arg Tyr Leu
485 490 495

Met Arg His Tyr Gln Leu Asp Val Ala Arg Ser Leu Ile Leu Cys Asn
500 505 510

Gly Thr Ile Ser Asp Gln Ala Phe Ser Gly Glu Thr Gly Leu Phe Thr
515 520 525

Thr Leu Phe Asn Thr Pro Pro Leu Asn Gly Gln Leu Phe Ser Ala Asp
530 535 540

Asp Thr Pro Leu Asp Leu Arg Ser Glu Ala Pro Glu Asp Ala Phe Arg
545 550 555 560

Leu Ser Val Leu Lys Arg Ala Phe Asn Ile Ser Ala Ser Gly Leu Ser
565 570 575

Thr Leu Trp Gln Leu Ala Ser Gly Asp Ser Ser Ala Gly Phe Ser Cys
580 585 590

Ser Ala Asp Asn Ile Ala Ala Leu Tyr Arg Val Lys Leu Leu Ala Asp
595 600 605

Ile His Asp Leu Ser Ala Gly Glu Leu Ser Met Leu Leu Ser Val Ser
610 615 620

Pro Phe Ser Gly Val Ala Ala Gly Ser Leu Ser Asp Asn Glu Leu Thr
625 630 635 640

Gln Phe Leu Tyr Gln Thr Thr Thr Trp Leu Thr Glu Gln Gly Trp Thr
645 650 655

Val Ser Asp Val Phe Leu Met Leu Thr Thr Gln Tyr Gly Thr Leu Leu
660 665 670

Thr Pro Asp Ile Glu Asn Leu Leu Ala Ser Leu Arg Asn Gly Leu Ser
675 680 685

Gly Arg Glu Leu Phe Pro Glu Thr Leu Pro Gly Asp Gly Ala Pro Phe
690 695 700

Ile Ala Ala Ala Met Gln Leu Asp Ala Thr Asp Thr Ala Lys Ala Met
705 710 715 720

Leu Thr Trp Ala Asp Gln Leu Lys Pro Glu Gly Leu Thr Leu Thr Glu
725 730 735

Phe Ile Leu Leu Val Met Asn Ala Ala Pro Asn Asp Glu Gln Ala Gly
740 745 750

Gln Met Ala Gly Phe Cys Gln Ala Leu Trp Gln Leu Ala Leu Ile Ile
755 760 765

Arg Ser Thr Gly Leu Ser Thr Arg Glu Leu Thr Leu Leu Val Ser Gln
770 775 780

Pro Gly Arg Phe Arg Thr Gly Trp His His Leu Pro His Asp Leu Pro
785 790 795 800

Ala Leu Arg Asp Ile Thr Arg Phe His Ala Val Val Asn Arg Ser Gly
805 810 815

Ser His Ala Gly Glu Val Leu Thr Ala Leu Glu Thr Gly Glu Leu Ser
820 825 830

Ser Ala Leu Leu Ala Arg Ala Leu Ser Gln Asn Glu Gln Asp Val Thr
835 840 845

Gly Ala Leu Ala Gln Val Arg Gly Ala Gly Glu Gln Asp Asn Ser Val
850 855 860

Phe Thr Ser Trp Glu Glu Val Asp Gln Ala Glu Gln Trp Leu Asp Met
865 870 875 880

Ser Glu Thr Leu Ser Ile Thr Pro Ser Gly Leu Ala Ser Leu Ile Ala
885 890 895

Leu Lys Tyr Ile Asn Val Ser Asp Asp Ser Ala Pro Leu Tyr Ser Gln
900 905 910

Trp Gln Val Val Ser Gly Leu Leu Gln Ala Gly Leu Lys Ser Ser Gln
915 920 925

Ser Ser Ala Leu His Asp Tyr Leu Glu Glu Gly Thr Ser Ser Ala Leu
930 935 940

Cys Ala Tyr Tyr Leu Arg Asn Leu Ala Pro Asn Met Val Ser Gly Arg
945 950 955 960

Asp Asp Leu Phe Gly Tyr Leu Leu Leu Asp Asn Gln Val Ser Ala Lys
965 970 975

Val Lys Thr Thr Arg Ile Ala Glu Ala Ile Ala Gly Ile Arg Leu Tyr
980 985 990

Ile Asn Arg Ala Leu Asn Gly Ile Glu Leu Ser Ala Met Ala Glu Val
995 1000 1005

Arg Gly Arg Gln Phe Phe Thr Asp Trp Asp Thr Phe Asn Lys Arg Tyr
1010 1015 1020

Ser Thr Trp Ala Gly Val Ser Glu Leu Val Tyr Tyr Pro Glu Asn Tyr
1025 1030 1035 1040

Leu Asp Pro Thr Val Arg Ile Gly Gln Thr Gly Met Met Asp Thr Leu
1045 1050 1055

Leu Gln Ser Val Ser Gln Ser Ser Ile Asn Arg Asp Thr Val Glu Asp
1060 1065 1070

Ala Phe Lys Thr Tyr Leu Thr Thr Phe Glu Gln Ile Ala Asn Leu Asn
1075 1080 1085

Thr Val Ser Gly Tyr His Asp Asn Ala Ser Met Thr Gln Gly Thr Thr
1090 1095 1100

Trp Tyr Val Gly Arg Ser Ile Thr Asp Gln Thr Asn Trp Tyr Trp Arg
1105 1110 1115 1120

Ser Ala Asn His Ser Lys Ile Gln Asp Ser Met Met Pro Ala Asn Ala
1125 1130 1135

Trp Thr Gly Trp Thr Lys Ile Asn Cys Gly Met Asn Pro Trp Ser Asp
1140 1145 1150

Leu Val Cys Ser Val Phe Phe Asn Ser Arg Leu Tyr Val Val Trp Val
1155 1160 1165

Glu Glu Asn Gln Ser Ala Asp Thr Glu Ala Glu Ser Thr Thr Thr Thr
1170 1175 1180

Gln Gln Ser Tyr Thr Leu Lys Leu Ser Phe Arg Arg Tyr Asp Gly Thr
1185 1190 1195 1200

Trp Ser Ser Pro Val Ser Phe Asp Ile Thr Gly Asn Ile Ala Phe Pro
1205 1210 1215

Glu Thr Gln Gly Met His Val Thr Cys Asn Pro Leu Thr Glu Gln Leu
1220 1225 1230

Tyr Cys Ala Phe Tyr Ser Val Thr Ser Lys Pro Asp Phe Asp Asn Ala
1235 1240 1245

Gln Leu Ile Ser Val Asp Asn Asp Met Thr Leu Asn Val Ile Ser Asp
1250 1255 1260

Ile Gly Ile Phe Lys Ser Val Ser His Glu Phe Asn Thr Ser Thr Glu
1265 1270 1275 1280

Lys Phe Ile Asn Asn Val Phe Ser Asp Pro Ser Ala Asn Tyr Phe Val
1285 1290 1295

Ser Ala Thr Ser Leu Ile Asp Asp Val Ile His Ser Asp Phe Ser Leu
1300 1305 1310

Leu Asn Ser Lys Thr Thr Ser Thr Val Phe Thr Asn Glu Asp Ser Ser
1315 1320 1325

Leu Leu Thr Pro Glu Leu His Ile Thr Ala Asn Val Ser Cys Phe Val
1330 1335 1340

Ser Thr Ala Gly Ile Ala Thr Gln Ser Thr Ile Glu Lys Phe Val Gln
1345 1350 1355 1360

Ala Gly Ile Glu Phe Glu Glu Ile Asn Phe Tyr Ala Gly Gln Ala Ala
1365 1370 1375

Gly Gly Phe Asp Gly Phe Val Gly Val Asp Val Ser Asn Ser Lys Val
1380 1385 1390

Tyr Gln Val Gly Lys Glu Ala Val Gly Val Thr Val Lys Ser Tyr Ser
1395 1400 1405

Val Thr Gly Val Ser Gly Ser Val Glu Leu Phe Ile Asp Ser Ser Asn
1410 1415 1420

Lys Tyr Phe Ser Gly Ile Leu Ser Asp Lys Met Ile Thr Ala Leu Ile
1425 1430 1435 1440

Ser Gly Ser Thr Ser Lys Val Asn Tyr Val Ser Ser Ile Gly Ser Gln
1445 1450 1455

Asp Phe Trp Ser Val Lys Ser Leu Met Pro Ala Leu Gln Ile Tyr Glu
1460 1465 1470

Leu Ile Asp Asp Ile Ile Leu Thr Ser Gly Val Asn Gly Thr Glu Ile
1475 1480 1485

Lys Ser Trp Pro Ser Ala Glu Trp Tyr Asn Asp Lys Leu Ser Leu Gln
1490 1495 1500

Ser Gly Asn Asn Leu Phe Asn Thr Lys Ser Leu Ser Phe Thr Val Asn
1505 1510 1515 1520

Thr Ser Asp Ile Val Glu Asp Glu Phe Asp Val Thr Phe Thr Phe Thr
1525 1530 1535

Ala Val Asp Gln Asn Asn Val Val Leu Ala Ala Arg Thr Ala Ile Leu
1540 1545 1550

Thr Val Ile Arg Asn Ile Asn Asn Asp Thr Ser Val Ile Ala Leu Arg
1555 1560 1565

Lys Asn Thr Arg Gly Ala Gln Tyr Ile Arg Phe Thr Ala Gly Asn Asp
1570 1575 1580

Val Ala Leu Ile Arg Leu Asn Thr Leu Phe Ala Arg Gln Leu Val Asp
1585 1590 1595 1600

Arg Ala Asn Thr Gly Ile Asp Thr Ile Leu Ser Met Glu Thr Gln Arg
1605 1610 1615

Leu Thr Glu Pro Ala Leu Glu Glu Gly Ser Asp Val Phe Met Asp Phe
1620 1625 1630

Ser Gly Ala Asn Ala Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro
1635 1640 1645

Met Met Val Phe Gln Arg Leu Leu Gln Glu Gln His Phe Pro Glu Ala
1650 1655 1660

Thr Arg Trp Leu Gln Tyr Val Trp Asn Pro Ala Gly His Val Val Asn
1665 1670 1675 1680

Gly Val Leu Gln Asn Tyr Thr Trp Asn Val Arg Pro Leu Glu Glu Asp
1685 1690 1695

Thr Gly Trp Asn Asp Ser Pro Leu Asp Ser Ile Asp Pro Asp Ala Ile
1700 1705 1710

Ala Gln Tyr Asp Pro Met His Tyr Lys Val Ala Thr Phe Met Ser Tyr
1715 1720 1725

Leu Asp Leu Leu Ile Ala Arg Gly Asp Ala Ala Tyr Arg Leu Leu Glu
1730 1735 1740

Arg Asp Thr Leu Asn Glu Ala Arg Met Trp Tyr Val Gln Ala Leu Asn
1745 1750 1755 1760

Leu Leu Gly Asp Glu Pro Tyr Ile Ser Phe Asp Ala Asp Trp Ser Ala
1765 1770 1775

Leu Thr Leu Gly Asp Ala Ala Ser Glu Val Thr Arg Arg Asp Tyr Gln
1780 1785 1790

Glu Ala Leu Leu Ala Val Arg Arg Leu Val Pro Ala Pro Glu Thr Arg
1795 1800 1805

Thr Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gln Gln Asn Glu Val
1810 1815 1820

Leu Lys Gly Tyr Trp Gln Thr Leu Ala Gln Arg Leu His Asn Leu Arg
1825 1830 1835 1840

His Asn Leu Ser Ile Asp Gly Gln Pro Leu Ser Leu Ser Val Tyr Ala
1845 1850 1855

Thr Pro Ser Glu Pro Ser Ala Leu Gln Ser Ala Val Val Asn Ser Ala
1860 1865 1870

Gln Gly Ala Ala Ala Leu Pro Ala Ala Val Met Pro Leu Tyr Ser Phe
1875 1880 1885

Pro Val Met Leu Glu Asn Ala Arg Gly Met Val Ser Leu Leu Thr Gly
1890 1895 1900

Phe Gly Asn Thr Leu Leu Gly Ile Thr Glu Arg Gln Asp Ala Glu Ala
1905 1910 1915 1920

Leu Ala Lys Leu Leu Gln Thr Gln Gly Ser Glu Leu Ile Arg Gln Gly
1925 1930 1935

Leu Arg Gln Gln Asp Asn Val Leu Glu Glu Ile Asp Ala Asp Ile Ala
1940 1945 1950

Ala Leu Glu Glu Ser Arg Arg Gly Ala Gln Met Arg Phe Glu Arg Tyr
1955 1960 1965

Lys Val Leu Tyr Glu Ala Asp Val Asn Thr Gly Glu Lys Gln Ala Met
1970 1975 1980

Asp Leu Tyr Leu Ser Ser Ser Val Leu Ser Ala Ser Thr Ala Ala Leu
1985 1990 1995 2000

Phe Leu Ala Glu Ala Ala Ala Asp Met Leu Pro Asn Ile Tyr Gly Leu
2005 2010 2015

Ala Val Gly Gly Ser Arg Tyr Gly Ala Leu Phe Lys Ala Thr Ala Ile
2020 2025 2030

Gly Ile Gln Val Ser Ser Asp Ala Thr Arg Ile Ser Ala Asp Lys Ile
2035 2040 2045

Ser Gln Ser Glu Val Tyr Arg Arg Arg Arg Glu Glu Trp Glu Ile Gln
2050 2055 2060

Arg Asp Ser Ala Gln Ser Asp Val Ala Gln Ile Asp Ala Gln Leu Ala
2065 2070 2075 2080

Ala Met Ala Val Arg Arg Glu Gly Ala Glu Leu Gln Lys Thr Tyr Leu
2085 2090 2095

Glu Thr Gln Gln Thr Gln Ala Gln Ala Gln Leu Ala Phe Leu Gln Ser
2100 2105 2110

Lys Phe Asn Asn Thr Ala Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser
2115 2120 2125

Ala Ile Tyr Tyr Gln Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu Met
2130 2135 2140

Ala Gln Gln Ala Trp Gln Trp Asp Lys Phe Glu Thr Arg Ser Phe Ile
2145 2150 2155 2160

Gln Pro Gly Ala Trp Met Gly Ala Asn Ala Gly Leu Leu Ala Gly Glu
2165 2170 2175

Thr Leu Met Leu Asn Leu Ala Gln Met Glu Gln Ala Trp Leu Thr Gly
2180 2185 2190

Asp Glu Arg Ala Ile Glu Val Thr Arg Thr Val Cys Leu Ser Glu Val
2195 2200 2205

Tyr Thr Ser Leu Ala Glu Asp Ala Ala Phe Ser Leu Ala Asp Lys Val
2210 2215 2220

Val Glu Leu Val Ser Asn Gly Ser Gly Ser Ala Gly Thr Lys Ser Asn
2225 2230 2235 2240

Gly Leu Gln Met Asp Gln Gln Gln Leu Glu Ala Thr Leu Lys Leu Ala
2245 2250 2255

Asp Leu Gly Ile Gly Asn Asp Tyr Pro Val Ser Leu Gly Thr Met Arg
2260 2265 2270

Arg Ile Lys Gln Ile Ser Val Thr Leu Pro Ala Leu Val Gly Pro Tyr
2275 2280 2285

Gln Asp Val Arg Ala Val Leu Ser Tyr Gly Gly Ser Met Val Met Pro
2290 2295 2300

Arg Gly Cys Ser Ala Leu Ala Val Ser His Gly Met Asn Asp Ser Gly
2305 2310 2315 2320

Gln Phe Gln Leu Asp Phe Asn Asp Pro Arg Tyr Leu Pro Phe Glu Gly
2325 2330 2335

Leu Pro Val Asp Asp Thr Gly Thr Leu Thr Leu Ser Phe Pro Asp Ala
2340 2345 2350

Asp Gly Lys Gln Gln Ala Met Leu Leu Ser Leu Ser Asp Ile Ile Leu
2355 2360 2365

His Ile Arg Tyr Thr Ile Ile Ser
2370 2375



Thr Ser Met Gln Asn His Gln Asp Met Ala Ile Thr Ala Pro Leu Pro
1 5 10 15

Gly Gly Gly Ala Val Thr Gly Leu Lys Gly Asp Ile Ala Ala Ala Gly
20 25 30

Gly Pro Asp Gly Ala Ala Thr Leu Ser Ile Pro Leu Pro Val Ser Pro
35 40 45

Gly Arg Gly Tyr Ala Pro Thr Gly Ala Leu Asn Tyr His Ser Arg Ser
50 55 60










Asn 


Gly 


Pro 


Phe 


Gly 


He 


Gly 


Trp 


Gly 


He 


Gly 


Gly 


Ala 


Ala 


Val 


Gin 


65 






70 










75 








Glu 


80 


Arg 


Arg 


Thr 


Arg 


Asn 


Gly 


Ala 


Pro 


Thr 


Tyr 


Asp 


Asp 


Thr 


Asp 


Phe 




85 










90 










95 




Thr 


Gly 


Pro 


Asp 


Gly 


Glu 


Val 


Leu 


Val 


Pro 


Ala 


Leu 


Thr 


Ala 


Ala 


Gly 






100 










105 










110 
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Thr Gin Glu Ala 
115 

Gly Ser Phe Asn 
130 

Ser Arg Leu Glu 
145 

Trp Val Leu Tyr 

Ala Gin Ala Arg 
180 

Trp Leu Met Glu 
195 

Gin Tyr Arg Ala 
210 

Ala His Pro Gin 
225 

Gly Asn Arg Gin 

Ser Met Asp Ser 
260 

Ser Ser Val Leu 
275 

Glu Trp Leu Cys 
290 

Asn Leu Arg Thr 
305 

Leu Gly Val Leu 

lie Ser Arg Leu 
340 

Leu Glu Asn Val 
355 

Ala Leu Pro Ala 
370 

Leu Ser Ala Trp 
385 

Gin Pro Tyr Gin 

Leu Tyr Gin Asp 
420 

Ser Gly Asp Asp 
435 

Pro Thr Met Pro 
450 

Gly Asp Gly Arg 
465 

Met Tyr Asp Arg 

Ser Ala Leu Pro 
500 

lie Leu Gly Ala 
515 

Val Arg Leu Tyr 
530 

Val Gin Gin Thr 
545 

Arg Thr Leu Val 

Leu Thr Glu Val 
580 

His Gly Arg Phe 
595 

Val Thr Thr Phe 
610 

Ser Gly Thr Thr 

625 

Tyr Phe Asn Gin 



Arg Gin Ala Thr 
12 0 

Val Gin Val Tyr 
135 

Arg Trp Leu Pro 
150 

Thr Pro Asp Gly 
165 

lie Ser Asn Pro 

Ser Ser Val Ser 
200 

Glu Asp Asp Asp 
215 

Ala Gly Ala Gin 

230 

Ala Ala Arg Thr 
245 

Trp Leu Phe lie 

Ser Glu Ala Pro 
280 

Arg Gin Asp Cys 
2 95 

Arg Arg Leu Cys 
310 

Ala Gly Ser Ser 
325 

Leu Leu Asp Tyr 

His Gin Val Ala 
360 

Leu Ala Leu Gly 
375 

Gin Thr Arg Asp 
390 

Leu Val Asp Leu 
405 

Ser Gly Ala Trp 

Pro Asp Ala Val 
440 

Ala Leu His Asn 
455 

Leu Glu Trp Val 
470 

Thr Pro Gly Arg 
485 

Val Glu Tyr Ala 

Gly Leu Thr Asp 
520 

Ser Gly Lys Asn 
53 5 

Glu Arg Leu Thr 
550 

Ala Phe Ser Asp 
565 

Arg Ala Asn Gly 

Gly Gin Pro Val 
600 

Asn Pro Asp Gin 
615 

Asp Leu lie Tyr 
630 

Ser Gly Asn Tyr 



Ser Leu Leu Gly 

Arg Ser Arg Thr 
140 

Ala Asp Glu Thr 
155 

Gin Val Ala Leu 
170 

Thr Ala Pro Thr 
185 

Leu Thr Gly Glu 

Gly Cys Asp Glu 
220 

Arg Tyr Pro Val 
23 5 

Leu Pro Ala Leu 
250 

Leu Val Phe Asp 
265 

Ala Trp Gin Thr 

Phe Ser Gly Tyr 
300 

Arg Gin Val Leu 
315 

Gly Ala Asn Asp 
330 

Arg Glu Ser Pro 
345 

Tyr Glu Ser Asp 

Trp Gin Thr Phe 
380 

Asp Met Gly Lys 
395 

Asn Gly Glu Gly 
410 

Trp Tyr Arg Glu 
425 

Thr Trp Gly Ala 

Ser Gly lie Leu 
460 

Val Thr Ala Pro 
475 

Asp Trp Leu His 
490 

His Pro Lys Ala 
505 

Met Val Leu lie 

Asp Gly Trp Asn 
540 

Leu Pro Val Pro 
555 

Met Ala Gly Ser 
570 

Val Arg Tyr Trp 
585 

Asn lie Pro Gly 

lie Leu Leu Ala 
620 

Ala Met Ser Asp 
635 

Phe Ala Glu Pro 



lie Asn Pro Gly 
125 

Glu Gly Ser Leu 

Glu Thr Glu Phe 
160 

Leu Gly Arg Asn 
175 

Gin Thr Ala Val 
190 

Gin Met Tyr Tyr 
205 

Ala Glu Arg Asp 

Ala Val Trp Tyr 
240 

Val Ser Thr Pro 
255 

Tyr Gly Glu Arg 
270 

Pro Gly Ser Gly 
285 

Glu Phe Gly Phe 

Met Phe His Tyr 
320 

Ala Pro Ala Leu 
335 

Ser Leu Ser Leu 
350 

Gly Thr Ser Cys 
365 

Thr Pro Pro Thr 

Leu Ser Leu Leu 
400 

Val Val Gly lie 
415 

Pro Val Arg Gin 
430 

Ala Ala Ala Leu 
445 

Ala Asp Leu Asn 

Gly Val Ala Gly 
480 

Phe Thr Pro Leu 
495 

Val Leu Ala Asp 
510 

Gly Pro Arg Ser 
525 

Lys Gly Glu Thr 

Gly Val Asp Pro 
56 0 

Gly Gin Gin His 
575 

Pro Asn Leu Gly 
590 

Phe Ser Gin Ser 
605 

Asp Thr Asp Gly 

Arg Leu Val lie 
64 0 

His Thr Leu Leu 
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645 650 655 

Leu Pro Lys Gly Val Arg Tyr Asp Arg Thr Cys Ser Leu Gin Val Ala 

660 665 670 

Asp He Gin Gly Leu Gly Val Pro Ser Leu Leu Leu Thr Val Pro His 

675 ' 680 685 

Val Ala Pro His His Trp Val Cys His Leu Ser Ala Asp Lys Pro Trp 

690 695 700 

Leu Leu Asn Gly Met Asn Asn Asn Met Gly Ala Arg His Ala Leu His 
705 710 715 720 

Tyr Arq Ser Ser Val Gin Phe Trp Leu Asp Glu Lys Ala Glu Ala Leu 

725 ~ 730 735 

Ala Ala Gly Ser Ser Pro Ala Cys Tyr Leu Pro Phe Thr Leu His Thr 

740 745 750 

Leu Trp Arg Ser Val Val Gin Asp Glu He Thr Gly Asn Arg Leu Val 

755 760 ' 765 

Ser Asp Val Leu Tyr Arg His Gly Val Trp Asp Gly Gin Glu Arg Glu 

770 775 780 

Phe Arg Gly Phe Gly Phe Val Glu lie Arg Asp Thr Asp Thr Leu Ala 
785 ~ " 790 795 800 

Ser Gin Gly Thr Ala Thr Glu Leu Ser Met Pro Ser Val Ser Arg Asn 

805 810 815 

Trp Tyr Ala Thr Gly Val Pro Ala Val Asp Glu Arg Leu Pro Glu Thr 

820 825 830 

Tyr Trp Gin Asn Asp Ala Ala Ala Phe Ala Asp Phe Ala Thr Arg Phe 

835 840 845 

Thr Val Gly Ser Gly Glu Asp Glu Gin Thr Tyr Thr Pro Asp Asp Ser 

850 855 860 

Lys Thr Phe Trp Leu Gln Arg Ala Leu Lys Gly Ile Leu Leu Arg Ser
865 870 875 880

Glu Leu Tyr Gly Ala Asp Gly Ser Ser Gln Ala Asp Ile Pro Tyr Ser
885 890 895

Val Thr Glu Ser Arg Pro Gln Val Arg Leu Val Glu Ala Asn Gly Asp
900 905 910

Tyr Pro Val Val Trp Pro Met Gly Ala Glu Ser Arg Thr Ser Val Tyr
915 920 925

Glu Arg Tyr His Asn Asp Pro Gln Cys Gln Gln Gln Ala Val Leu Leu
930 935 940

Ser Asp Glu Tyr Gly Phe Pro Leu Arg Gln Val Ser Val Asn Tyr Pro
945 950 955 960

Arg Arg Pro Pro Ser Ala Asp Asn Pro Tyr Pro Ala Ser Leu Pro Ala
965 970 975

Thr Leu Phe Ala Asn Ser Tyr Asp Glu Gln Gln Gln Ile Leu Arg Leu
980 985 990

Gly Leu Gln Gln Ser Ser Ala His His Leu Val Ser Leu Ser Glu Gly
995 1000 1005

His Trp Leu Leu Gly Leu Ala Glu Ala Ser Arg Asp Asp Val Phe Thr
1010 1015 1020

Tyr Ser Ala Asp Asn Val Pro Glu Gly Gly Leu Thr Leu Glu His Leu
1025 1030 1035 1040

Leu Ala Pro Glu Ser Leu Val Ser Asp Ser Gln Val Gly Thr Leu Ala
1045 1050 1055

Gly Gln Gln Gln Val Trp Tyr Leu Asp Ser Gln Asp Val Ala Thr Val
1060 1065 1070

Ala Ala Pro Pro Leu Pro Pro Lys Val Ala Phe Ile Glu Thr Ala Val
1075 1080 1085

Leu Asp Glu Gly Met Val Ser Ser Leu Ala Ala Tyr Ile Val Asp Glu
1090 1095 1100

His Leu Glu Gln Ala Gly Tyr Arg Gln Ser Gly Tyr Leu Phe Pro Arg
1105 1110 1115 1120

Gly Arg Glu Ala Glu Gln Ala Leu Trp Thr Gln Cys Gln Gly Tyr Val
1125 1130 1135

Thr Tyr Ala Gly Ala Glu His Phe Trp Leu Pro Leu Ser Phe Arg Asp
1140 1145 1150

Ser Met Leu Thr Gly Pro Val Thr Val Thr Arg Asp Ala Tyr Asp Cys
1155 1160 1165

Val Ile Thr Gln Trp Gln Asp Ala Ala Gly Ile Val Thr Thr Ala Asp
1170 1175 1180

Tyr Asp Trp Arg Phe Leu Thr Pro Val Arg Val Thr Asp Pro Asn Asp
1185 1190 1195 1200

Asn Leu Gln Ser Val Thr Leu Asp Ala Leu Gly Arg Val Thr Thr Leu
1205 1210 1215

Arg Phe Trp Gly Thr Glu Asn Gly Ile Ala Thr Gly Tyr Ser Asp Ala
1220 1225 1230

Thr Leu Ser Val Pro Asp Gly Ala Ala Ala Ala Leu Ala Leu Thr Ala
1235 1240 1245

Pro Leu Pro Val Ala Gln Cys Leu Val Tyr Val Thr Asp Ser Trp Gly
1250 1255 1260

Asp Asp Asp Asn Glu Lys Met Pro Pro His Val Val Val Leu Ala Thr
1265 1270 1275 1280

Asp Arg Tyr Asp Ser Asp Thr Gly Gln Gln Val Arg Gln Gln Val Thr
1285 1290 1295

Phe Ser Asp Gly Phe Gly Arg Glu Leu Gln Ser Ala Thr Arg Gln Ala
1300 1305 1310

Glu Gly Asn Ala Trp Gln Arg Gly Arg Asp Gly Lys Leu Val Thr Ala
1315 1320 1325

Ser Asp Gly Leu Pro Val Thr Val Ala Thr Asn Phe Arg Trp Ala Val
1330 1335 1340

Thr Gly Arg Ala Glu Tyr Asp Asn Lys Gly Leu Pro Val Arg Val Tyr
1345 1350 1355 1360

Gln Pro Tyr Phe Leu Asp Ser Trp Gln Tyr Val Ser Asp Asp Ser Ala
1365 1370 1375

Arg Gln Asp Leu Tyr Ala Asp Thr His Phe Tyr Asp Pro Thr Ala Arg
1380 1385 1390

Glu Trp Gln Val Ile Thr Ala Lys Gly Glu Arg Arg Gln Val Leu Tyr
1395 1400 1405

Thr Pro Trp Phe Val Val Ser Glu Asp Glu Asn Asp Thr Val Gly Leu
1410 1415 1420

Asn Asp Ala Ser
1425

Met Ser Thr Ser Leu Phe Ser Ser Thr Pro Ser Val Ala Val Leu Asp
1 5 10 15

Asn Arg Gly Leu Leu Val Arg Glu Leu Gln Tyr Tyr Arg His Pro Asp
20 25 30

Thr Pro Glu Glu Thr Asp Glu Arg Ile Thr Cys His Gln His Asp Glu
35 40 45

Gly Arg Gly Ser Leu Ser Gln Ser Ala Asp Pro Arg Leu His Ala Ala
50 55 60

Leu Thr Asn Phe Thr Tyr Leu Asn Ser Leu Thr Gly Thr Val Leu Gln
65 70 75 80

Ser Val Ser Ala Asp Ala Gly Thr Ser Leu Glu Leu Ser Asp Ala Ala
85 90 95

Gly Arg Ala Phe Leu Ala Val Thr Gly Ala Gly Thr Glu Asp Ala Val
100 105 110

Thr Arg Thr Trp Gln Tyr Glu Asp Asp Thr Leu Pro Gly Arg Pro Leu
115 120 125

Ser Ile Thr Glu Gln Val Thr Gly Glu Ala Ala Gln Ile Thr Glu Arg
130 135 140

Phe Val Tyr Ala Gly Asn Thr Asp Ala Glu Lys Ile Leu Asn Leu Ala
145 150 155 160

Gly Gln Cys Val Ser His Tyr Asp Thr Ala Gly Leu Val Gln Thr
165 170 175

Ser Ile Ala Leu Ser Gly Val Pro Leu Ala Val Thr Arg Gln Leu Leu
180 185 190

Pro Asp Ala Ala Gly Ala Asn Trp Met Gly Glu Asp Ala Ser Ala Trp
195 200 205

Asn Asp Leu Leu Asp Gly Glu Thr Phe Phe Thr Gln Thr His Ala Asp
210 215 220

Ala Thr Gly Ala Val Leu Ser Ile Thr Asp Ala Lys Gly Asn Leu Gln
225 230 235 240

Arg Val Ala Tyr Asp Val Ala Gly Leu Leu Ser Gly Ser Trp Leu Thr
245 250 255

Leu Lys Asp Gly Thr Glu Gln Val Ile Val Ala Ser Leu Thr Tyr Ser
260 265 270

Ala Ala Gly Lys Lys Leu Arg Glu Glu His Gly Asn Gly Val Val Thr
275 280 285

Ser Tyr Ile Tyr Glu Pro Glu Thr Gln Arg Leu Thr Gly Ile Lys Thr
290 295 300

Glu Arg Pro Ser Gly His Val Ala Gly Ala Lys Val Leu Gln Asp Leu
305 310 315 320

Arg Tyr Thr Tyr Asp Pro Val Gly Asn Val Leu Ser Val Asn Asn Asp
325 330 335

Ala Glu Glu Thr Arg Phe Trp Arg Asn Gln Lys Val Val Pro Glu Asn
340 345 350

Thr Tyr Ile Tyr Asp Ser Leu Tyr Gln Leu Val Ser Ala Thr Gly Arg
355 360 365

Glu Met Ala Asn Ala Gly Gln Gln Gly Asn Asp Leu Pro Ser Ala Thr
370 375 380

Ala Pro Leu Pro Thr Asp Ser Ser Ala Tyr Thr Asn Tyr Thr Arg Thr
385 390 395 400

Tyr Arg Tyr Asp Arg Gly Gly Asn Leu Thr Gln Met Arg His Ser Ala
405 410 415

Pro Ala Thr Asn Asn Asn Tyr Thr Thr Asp Ile Thr Val Ser Asp Arg
420 425 430

Ser Asn Arg Ala Val Leu Ser Thr Leu Ala Glu Val Pro Ser Asp Val
435 440 445

Asp Met Leu Phe Ser Ala Gly Gly His Gln Lys His Leu Gln Pro Gly
450 455 460

Gln Ala Leu Val Trp Thr Pro Arg Gly Glu Leu Gln Lys Val Thr Pro
465 470 475 480

Val Val Arg Asp Gly Gly Ala Asp Asp Ser Glu Ser Tyr Arg Tyr Asp
485 490 495

Ala Gly Ser Gln Arg Ile Ile Lys Thr Gly Thr Arg Gln Thr Gly Asn
500 505 510

Asn Val Gln Thr Gln Arg Val Val Tyr Leu Pro Gly Leu Glu Leu Arg
515 520 525

Ile Met Ala Asn Gly Val Thr Glu Lys Glu Ser Leu Gln Val Ile Thr
530 535 540

Val Gly Glu Ala Gly Arg Ala Gln Val Arg Val Leu His Trp Glu Ile
545 550 555 560

Gly Lys Pro Asp Asp Leu Asp Glu Asp Ser Val Arg Tyr Ser Tyr Asp
565 570 575

Asn Leu Val Gly Ser Ser Gln Leu Glu Leu Asp Arg Glu Gly Tyr Leu
580 585 590

Ile Ser Glu Glu Glu Phe Tyr Pro Tyr Gly Gly Thr Ala Val Leu Thr
595 600 605

Ala Arg Ser Glu Val Glu Ala Asp Tyr Lys Thr Ile Arg Tyr Ser Gly
610 615 620

Lys Glu Arg Asp Ala Thr Gly Leu Asp Tyr Tyr Gly Tyr Arg Tyr Tyr
625 630 635 640

Gln Pro Trp Ala Gly Arg Trp Leu Ser Thr Asp Pro Ala Gly Thr Val
645 650 655

Asp Gly Leu Asn Leu Phe Arg Met Val Arg Asn Asn Pro Val Thr Leu
660 665 670

Phe Asp Ser Asn Gly Arg Ile Ser Thr Gly Gln Glu Ala Arg Arg Leu
675 680 685

Val Gly Glu Ala Phe Val His Pro Leu His Met Pro Val Phe Glu Arg
690 695 700

Ile Ser Val Glu Arg Lys Ile Ser Met Ser Val Arg Glu Ala Gly Ile
705 710 715 720

His Tyr Thr Tip 11c A "1 a Leu Gly Glu Gly Ala Ala Ala Lys Gly Asn
725 730 735

Gly Ile Leu bill Lys J. Ill Ile Pro Gly Ser Leu Lys Ala Ile Tyr
745 750

Val Asp Lys Ala _ oer Ile Leu Gly Leu Ala Lys Arg Ser Glv Leu
755 760 765

Gly Arg Val biy Trp Asp Ala Ser Gly Val Aro Glv Ile Tyr Ala
770 775 780

His Asn Arg Pro Gly Gly Glu Asp Leu Val Tyr Pro Val Ser Leu Gln
785 790 795 800

Asn Thr Ser Ala Asn Glu Ile Val Asn Ala Trp Ile Lys Phe Lys Ile
805 810 815

Ile Thr ir ro lyr Thr Gly Asp Tyr Asp i v ie u TT -1 G hi s Asp Ile Ile Lys Phe
820 825 830

Ser Asp uiy Lys His Val Pro Thr Ala Glu Ser Ser Glu Glu Arg
835 840 845

Gly Val Lys Asp Leu Ile Asn Lys Glv Val Ala Glu Val Asp Pro Ser
850 855 860

Arg Pro Jr Ilfc: kjl u. lyr Thr Ala Met Asn Val Ile Arg His Gly Pro Gln
865 870 875 880

Val Asn Phe Val Pro Tyr Met Tru Glu His Glu His Asp Lys Val Val
885 890 895

Asn Asp Asn Gly Tyr Leu Gly Val Val Ala Ser Pro Gly Pro Phe Pro
900 905 910

Val Ala Met Val His Gln Gly Glu Trp Thr Val Phe Asp Asn Ser Glu
915 920 925

His Glu Leu Phe Asn Phe Tyr Lys Ser Thr Asn Thr Pro Leu Pro Glu
930 935 940

Trp Ser Gln Asp Phe Met Asp Arg Gly Lys Gly Ile Val Ala Thr Pro
945 950 955 960

Arg His Ala Glu Leu Leu Asp Lys Arg Arg Val Met Tyr
965 970

ggatccgagt gaaggaatca tcggccgctt tatacgtttc agggtgaata cggttggccg 60
caacgtggca atggatgttg tttgtgtcgg tatgaatcgc cgcaacgtac tggtgttctg 120 
acatacccag tgccgataaa ctgtgacgaa cactatcaaa gatgtgttcc gtcgacctga 180 
aagccaggat ttatttttac accaatggtt gggtgggctt cctttctgaa ctggtgcatc 240 
atttagcdgg catcatcaaa agatgcatgg aaatacaaat atcatattta cagacaccca 300 
agttgatgac ctgctccgtg agttgaaatg ccgacggggg aaatcagcag ccttttcaac 360 
tcatggagca gggggaaatc aatcctcaat aacccgcatt ggatatcctg ccagtgtgca 420 
tttaaccttt ttagtgtgtt tccttaatat cccaatcgtt gaatcgctac atacggcaga 480 
cattagtatc tcacttatca tcaaagtaat atcacaccga gaatgctaat ttcatgatat 540 
gaaaacgttc cattaataaa ttttcagaaa cctaacacgg catttttatg ctgatcagtg 600 
aattgattgt ttctgaaaaa attaattgca cctctgccac ttatcagata aaaacacccc 660
atgcggtaag ttttttattt tttattaatg attttattaa tgattttatt aatgatttta 720 
ttaatgattt tattaatgat tttactatag atgaatgtta acatgggtga taatttactt 780 
tactcaattt aattgttggt atgaccatgt tttagatgag tggcacggat tcattattgt 840 
aaaaaaagta tctaaaacct ttagcagcaa tcctacttga ggatgacctc gacaggactt 900 
gattattgcc attttttacg aaggaagatg acgggtgata aataataaaa aaaacaaaag 960 
tatagcctta ggtatcgccg attacatcca gtaacactta ttgacttttt tttacttcta 1020
ccgttagcta taaatatgat atttaaatct gtatttttat ataaaaccag tttatgatgc 1080
tggattggtc attaaagtcg ttatatgtga tcgttatctg tcattgattg gtgtttaatc 1140
ttttattctt ccagtgaggt ttcaggggga atgtattggg taatcatact catgtcattt 1200
gttgctttga tgttaaatta acgtgttcat tcattatgtt ctactgttgt ttctattgtc 1260
cggaacgacc atagagactg tcgctatgtt aataggaata tttgactggt tatatgcgcc 1320
aagggttatc gctgcactct ctggggcgat ggtattcatc attacgcaag ataacttcat 1380
tggtgtcaga cgggtgttat tgttttttgt gtctttttta ctcggtttga cattttcaga 1440
gacaacagct tccgttatca acttctatat cccgaatgat atacatatag gaaatgacct 1500
tggtgccttt gttaccagcg ccgtgacggt gaagcttttt gttatcatta tgagcaagat 1560
agagagaaaa tatcttggag aataaccgcc atgttccaaa tcatacttct taatgttaat 1620
gccgtgattt gcttggctat tgccgtcaga ttattcctgt ggcgtatcaa tcataaaatg 1680
aaaaacattg tcgtctcttt tattgctttt ctcattatta cggcgtgcgg cgctgtctcc 1740
atcaggacga tgacggggga gtattactat gcggattggt ccgagacgat cattaacctt 1800
tcgcttttcc tgtctgttta tatacgcaat ggcgaaatcc ttcggtgggg ggagaaaaa 1859

atg aag ata agt tcc cga ggt atc gca tta atc aaa gag ttc gaa ggt 1907
Met Lys Ile Ser Ser Arg Gly Ile Ala Leu Ile Lys Glu Phe Glu Gly
1 5 10 15

ctg cgc tta cac gct tat cgc tgc gcc gct gac gtc tgg act gtc ggt 1955
Leu Arg Leu His Ala Tyr Arg Cys Ala Ala Asp Val Trp Thr Val Gly
20 25 30

tat ggc cac acg gca ggg gtt aca aag ggt gac atc atc acg gtc gat 2003
Tyr Gly His Thr Ala Gly Val Thr Lys Gly Asp Ile Ile Thr Val Asp
35 40 45

gaa gcc cag acg atg ctg aca aac gat att acc gta ttt gaa cgg gcg 2051
Glu Ala Gln Thr Met Leu Thr Asn Asp Ile Thr Val Phe Glu Arg Ala
50 55 60

gtc agt cag gcc gtc gcg gtt cct ctg aat cag tcg caa tac gat gcc 2099
Val Ser Gln Ala Val Ala Val Pro Leu Asn Gln Ser Gln Tyr Asp Ala
65 70 75 80

ctg gtt tct ttg gtt ttt aat att ggc cag ggg aat ttt aaa cgc tct 2147
Leu Val Ser Leu Val Phe Asn Ile Gly Gln Gly Asn Phe Lys Arg Ser
85 90 95

acc tt$ ttg aaa aaa ctc aac aaa cag gac tat gtc ggc gcc ggg aac 2195
Thr Leu Leu Lys Lys Leu Asn Lys Gln Asp Tyr Val Gly Ala Gly Asn
100 105 110

gag ttt tta cgc tgg acc cgg gcc aat ggg aag gtc ctt ccc gga ctg 2243
Glu Phe Leu Arg Trp Thr Arg Ala Asn Gly Lys Val Leu Pro Gly Leu
115 120 125

att cgc cga cgc gaa gct gaa cgg gtg ttg ttt gag aaa ctg ggt gca 2291
Ile Arg Arg Arg Glu Ala Glu Arg Val Leu Phe Glu Lys Leu Gly Ala
130 135 140

taa ccctttgcga cgtacccaca agatgaagat aacaccgcgt actgagcggt 2344
145

ggcgcaacaa tgaataaatg actgtgtacg gcctgtcctt cacaacggat gggaccatca 2404

acgtaa tga atg agg caa gac att atg tat aat att gat gat att ctg 2452
Met Arg Gln Asp Ile Met Tyr Asn Ile Asp Asp Ile Leu
150 155

gag aaa gtg aat gct cca cga gca cgc ctg tca gaa gaa aac gat aca 2500
Glu Lys Val Asn Ala Pro Arg Ala Arg Leu Ser Glu Glu Asn Asp Thr
160 165 170 175

gcg gtg acg ctg acg gat tta ttc tcg cgt tcg ttt ccc gag gtc aaa 2548
Ala Val Thr Leu Thr Asp Leu Phe Ser Arg Ser Phe Pro Glu Val Lys
180 185 190

aaa atc act ggc gac agc ctg tca tgg gga gag gtc tgc tat ctg tac 2596
Lys Ile Thr Gly Asp Ser Leu Ser Trp Gly Glu Val Cys Tyr Leu Tyr
195 200 205

agt cag gcg cag cac gaa cag aaa gaa aac cgg ctc acc gaa tcc cgt 2644
Ser Gln Ala Gln His Glu Gln Lys Glu Asn Arg Leu Thr Glu Ser Arg
210 215 220

att ctg gcc cgg gcg aat ccc cta ctg gtg aat gcc gtt cgc ctg gga 2692
Ile Leu Ala Arg Ala Asn Pro Leu Leu Val Asn Ala Val Arg Leu Gly
225 230 235

ata cgg cag gca gcc ggc agt cgc agc tat gat gac tgg ttt ggc tcc 2740
Ile Arg Gln Ala Ala Gly Ser Arg Ser Tyr Asp Asp Trp Phe Gly Ser
240 245 250 255

cgc gca gac cgt ttc gcc cgc ccc ggc tcg gtg gcc tcc atg ttc tca 2788
Arg Ala Asp Arg Phe Ala Arg Pro Gly Ser Val Ala Ser Met Phe Ser
260 265 270

ccg gcg gcg tat ctg acc gag ctg tac cgt gag gcg aag gac ctg cat 2836
Pro Ala Ala Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asp Leu His
275 280 285

ccg gac acc tcg ctg ttc cgg ctg gac atc cgg cgt ccc gac ctg gcg 2884
Pro Asp Thr Ser Leu Phe Arg Leu Asp Ile Arg Arg Pro Asp Leu Ala
290 295 300

gcg ctg gcc ctt agc cag aat aat atg gac gac gag ctc tcc acc ctg 2932
Ala Leu Ala Leu Ser Gln Asn Asn Met Asp Asp Glu Leu Ser Thr Leu
305 310 315

agc ctg tcc aat gag cta ctg tat cgc ggt atc ggg gca gcg gaa ggg 2980
Ser Leu Ser Asn Glu Leu Leu Tyr Arg Gly Ile Gly Ala Ala Glu Gly
320 325 330 335

ctt gac gac gac agc gtc agg gag ctg ctc gcc ggg tat cgc ctg acc 3028
Leu Asp Asp Asp Ser Val Arg Glu Leu Leu Ala Gly Tyr Arg Leu Thr
340 345 350

ggc ctg acc ccc tat cac tgg gcg tac gag gcg gcc cgc caa gcc att 3076
Gly Leu Thr Pro Tyr His Trp Ala Tyr Glu Ala Ala Arg Gln Ala Ile
355 360 365

ctg gtg cag gac ccg acg ctg atg ggg ttc agc cgt aat ccg gat gtg 3124
Leu Val Gln Asp Pro Thr Leu Met Gly Phe Ser Arg Asn Pro Asp Val
370 375 380

gcg cag ctt atg gac cct gcc tcc atg ctg gcc att gaa gcc gat att 3172
Ala Gln Leu Met Asp Pro Ala Ser Met Leu Ala Ile Glu Ala Asp Ile
385 390 395

tca ccg gag ctg tat cag ata ctg gcc gaa gaa att acg aca gac agt 3220
Ser Pro Glu Leu Tyr Gln Ile Leu Ala Glu Glu Ile Thr Thr Asp Ser
400 405 410 415

tac gaa gca ctc tgg agt aag aat ttt ggt gat atg cct ccc tcc tca 3268
Tyr Glu Ala Leu Trp Ser Lys Asn Phe Gly Asp Met Pro Pro Ser Ser
420 425 430

ctg tta tct tat gat gca ctt gca aca ttt tat gat ctt gat tac gat 3316
Leu Leu Ser Tyr Asp Ala Leu Ala Thr Phe Tyr Asp Leu Asp Tyr Asp
435 440 445

gag cta act tcg tta ttg tca tta agg ctg gac ttt tca aat cca aac 3364
Glu Leu Thr Ser Leu Leu Ser Leu Arg Leu Asp Phe Ser Asn Pro Asn
450 455 460

aat gaa tac tac att aat agt caa tta agt gtc gta act ctg aat gaa 3412
Asn Glu Tyr Tyr Ile Asn Ser Gln Leu Ser Val Val Thr Leu Asn Glu
465 470 475

agc act ggt tta ata act ata cat cat tat tta aga acg cta ggc gga 3460
Ser Thr Gly Leu Ile Thr Ile His His Tyr Leu Arg Thr Leu Gly Gly
480 485 490 495

gac tca cag cag att aac cct gag ctt ata cct tat ggg gat gga aca 3508
Asp Ser Gln Gln Ile Asn Pro Glu Leu Ile Pro Tyr Gly Asp Gly Thr
500 505 510

tat ctt tat aat ttc agc gtg gtg tca acg ata tca gag gat agt ttc 3556
Tyr Leu Tyr Asn Phe Ser Val Val Ser Thr Ile Ser Glu Asp Ser Phe
515 520 525

aaa cta ggg tcg tta ggt tct aac agt agc aat ctt tac tct ggg gat 3604
Lys Leu Gly Ser Leu Gly Ser Asn Ser Ser Asn Leu Tyr Ser Gly Asp
530 535 540

tat cag ctt caa aaa ggg gtt cgc tat agc att cct gtt gaa ata gat 3652
Tyr Gln Leu Gln Lys Gly Val Arg Tyr Ser Ile Pro Val Glu Ile Asp
545 550 555
gaa gga aag tta aat gat ggg atc aca ata gga ttg agt agg aaa ggg 3700
Glu Gly Lys Leu Asn Asp Gly Ile Thr Ile Gly Leu Ser Arg Lys Gly
560 565 570 575

ggg gga tat tac tca aca gta aac ttc act ctg att gaa tat gat cct 3748
Gly Gly Tyr Tyr Ser Thr Val Asn Phe Thr Leu Ile Glu Tyr Asp Pro
580 585 590

gcg ata ttc att ctt aaa tta aat aaa gtt atc cgc cta tac aag gcc 3796
Ala Ile Phe Ile Leu Lys Leu Asn Lys Val Ile Arg Leu Tyr Lys Ala
595 600 605

acg ggc atg acc acg gcg gaa ata tat caa atc acc aat att ctt aat 3844
Thr Gly Met Thr Thr Ala Glu Ile Tyr Gln Ile Thr Asn Ile Leu Asn
610 615 620

aac ggt ctc acc att gac cat gcg gtc ctg agt aaa atc ttc ctg gtc 3892
Asn Gly Leu Thr Ile Asp His Ala Val Leu Ser Lys Ile Phe Leu Val
625 630 635

cgt tac ctg atg cgt cac tat cag ctt gat gtg gcc cgg tca ctg ata 3940
Arg Tyr Leu Met Arg His Tyr Gln Leu Asp Val Ala Arg Ser Leu Ile
640 645 650 655

ttg tgc aac gga acc atc agt gac cag gcg ttc agc ggc gaa acc ggc 3988
Leu Cys Asn Gly Thr Ile Ser Asp Gln Ala Phe Ser Gly Glu Thr Gly
660 665 670

ctg ttc acc acg ctg ttc aac acc cca ccg ctg aac ggc cag ctg ttt 4036
Leu Phe Thr Thr Leu Phe Asn Thr Pro Pro Leu Asn Gly Gln Leu Phe
675 680 685

tct gca gat gat acc ccc ctc gac tta cgc tct gaa gca ccg gag gat 4084
Ser Ala Asp Asp Thr Pro Leu Asp Leu Arg Ser Glu Ala Pro Glu Asp
690 695 700

gct ttc cgt ctc agc gta ctg aaa cgc gca ttt aac atc agc gcc tcg 4132
Ala Phe Arg Leu Ser Val Leu Lys Arg Ala Phe Asn Ile Ser Ala Ser
705 710 715

ggg ctt tcc acg ctc tgg cag ttg gcc agc ggt gac agc agc gct ggg 4180
Gly Leu Ser Thr Leu Trp Gln Leu Ala Ser Gly Asp Ser Ser Ala Gly
720 725 730 735

ttt agc tgc tct gct gac aat atc gcc gca ctc tac cga gtg aaa ctc 4228
Phe Ser Cys Ser Ala Asp Asn Ile Ala Ala Leu Tyr Arg Val Lys Leu
740 745 750

ctg gct gac atc cac gac cta tcc gct ggt gag ctg tca atg ttg ctg 4276
Leu Ala Asp Ile His Asp Leu Ser Ala Gly Glu Leu Ser Met Leu Leu
755 760 765

tcc gtc tcc cct ttc agc ggg gtg gcc gcc ggc tcg ctg tcc gat aat 4324
Ser Val Ser Pro Phe Ser Gly Val Ala Ala Gly Ser Leu Ser Asp Asn
770 775 780



gag ctg acg cag ttt ctg tac cag acc acc acc tgg ctc acg gag cag 4372
Glu Leu Thr Gln Phe Leu Tyr Gln Thr Thr Thr Trp Leu Thr Glu Gln
785 790 795

ggc tgg acg gtc agc gat gtg ttc ctg atg ctg acg acg cag tac ggt 4420
Gly Trp Thr Val Ser Asp Val Phe Leu Met Leu Thr Thr Gln Tyr Gly
800 805 810 815

acc ctg ctg act ccc gac att gag aac ctg ctc gct tcc ctg cgc aac 4468
Thr Leu Leu Thr Pro Asp Ile Glu Asn Leu Leu Ala Ser Leu Arg Asn
820 825 830

gga ctg tcg ggc cgt gag ctg ttc ccg gaa acg ctc ccc ggc gat ggc 4516
Gly Leu Ser Gly Arg Glu Leu Phe Pro Glu Thr Leu Pro Gly Asp Gly
835 840 845

gct ccc ttt att gcc gcc gcc atg cag ctg gac gcc acg gat acg gcg 4564
Ala Pro Phe Ile Ala Ala Ala Met Gln Leu Asp Ala Thr Asp Thr Ala
850 855 860

aag gcg atg ctg act tgg gcg gac cag ttg aag cca gag ggg ctg acg 4612
Lys Ala Met Leu Thr Trp Ala Asp Gln Leu Lys Pro Glu Gly Leu Thr
865 870 875

ctg acg gaa ttt att ctt ttg gtg atg aat gcc gcc cca aat gac gag 4660
Leu Thr Glu Phe Ile Leu Leu Val Met Asn Ala Ala Pro Asn Asp Glu
880 885 890 895

cag gcg ggc cag atg gca ggg ttc tgc caa gcc ctg tgg caa ctg gca 4708
Gln Ala Gly Gln Met Ala Gly Phe Cys Gln Ala Leu Trp Gln Leu Ala
900 905 910

ctg atc atc cgc agc acc ggc ctc agc acg cgc gag ctg acg ctg ctg 4756
Leu Ile Ile Arg Ser Thr Gly Leu Ser Thr Arg Glu Leu Thr Leu Leu
915 920 925

gtc agc cag ccg gga cgc ttc cgc aca gga tgg cac cat ctg ccc cat 4804
Val Ser Gln Pro Gly Arg Phe Arg Thr Gly Trp His His Leu Pro His
930 935 940

gac ctg ccg gcg ctt cgc gac att acg cgt ttt cat gcc gtc gtt aac 4852
Asp Leu Pro Ala Leu Arg Asp Ile Thr Arg Phe His Ala Val Val Asn
945 950 955

cgc agc ggc agc cat gcc ggg gag gtc ctg acc gca ctt gag acc gga 4900
Arg Ser Gly Ser His Ala Gly Glu Val Leu Thr Ala Leu Glu Thr Gly
960 965 970 975

gaa ctg tcg tca gcc ctg ctg gcc cgg gcc ctg tca cag aat gag cag 4948
Glu Leu Ser Ser Ala Leu Leu Ala Arg Ala Leu Ser Gln Asn Glu Gln
980 985 990

gat gtg acc ggc gcc ttg gcg cag gtg agg ggg gcc ggt gaa cag gac 4996
Asp Val Thr Gly Ala Leu Ala Gln Val Arg Gly Ala Gly Glu Gln Asp
995 1000 1005

aac agc gtg ttc acc tcc tgg gaa gag gtg gac cag gct gag cag tgg 5044
Asn Ser Val Phe Thr Ser Trp Glu Glu Val Asp Gln Ala Glu Gln Trp
1010 1015 1020
ctg gac atg agt gag acc ctg tcc att acg cca tcc ggt ctg gct agc 5092
Leu Asp Met Ser Glu Thr Leu Ser Ile Thr Pro Ser Gly Leu Ala Ser
1025 1030 1035

ctg att gcc ctg aag tac atc aat gtg tcc gat gac agt gca ccg ttg 5140
Leu Ile Ala Leu Lys Tyr Ile Asn Val Ser Asp Asp Ser Ala Pro Leu
1040 1045 1050 1055

tac agc cag tgg cag gtg gta tcc ggt ctg ctg cag gcc ggg ctg aaa 5188
Tyr Ser Gln Trp Gln Val Val Ser Gly Leu Leu Gln Ala Gly Leu Lys
1060 1065 1070

agc agc cag agc tcg gcg ctg cac gat tat ctg gag gag ggg acc agc 5236
Ser Ser Gln Ser Ser Ala Leu His Asp Tyr Leu Glu Glu Gly Thr Ser
1075 1080 1085

agc gcc ctt tgt gcg tat tat ctg cgt aat ctg gca ccg aac atg gta 5284
Ser Ala Leu Cys Ala Tyr Tyr Leu Arg Asn Leu Ala Pro Asn Met Val
1090 1095 1100

tcc ggg cgc gat gac ctc ttc ggg tat ctg ctg ctg gat aat cag gtg 5332
Ser Gly Arg Asp Asp Leu Phe Gly Tyr Leu Leu Leu Asp Asn Gln Val
1105 1110 1115

tca gcc aag gta aaa acc acc cgc att gcg gag gcc atc gcc ggc ata 5380
Ser Ala Lys Val Lys Thr Thr Arg Ile Ala Glu Ala Ile Ala Gly Ile
1120 1125 1130 1135

cgg ctg tat atc aac cgg gcc ctt aac gga ata gaa ctc agc gcc atg 5428
Arg Leu Tyr Ile Asn Arg Ala Leu Asn Gly Ile Glu Leu Ser Ala Met
1140 1145 1150

gca gag gtg agg ggg cgt cag ttt ttc act gac tgg gat acg ttc aac 5476
Ala Glu Val Arg Gly Arg Gln Phe Phe Thr Asp Trp Asp Thr Phe Asn
1155 1160 1165

aaa cgt tac agc acc tgg gcg ggc gtc tca gag ctg gtt tac tat ccg 5524
Lys Arg Tyr Ser Thr Trp Ala Gly Val Ser Glu Leu Val Tyr Tyr Pro
1170 1175 1180

gaa aac tac ctc gac ccg acg gtc cgt atc ggg cag acc ggc atg atg 5572
Glu Asn Tyr Leu Asp Pro Thr Val Arg Ile Gly Gln Thr Gly Met Met
1185 1190 1195

gac acc ctg ctg cag tct gtc agc cag agc agt atc aac cgc gat acc 5620
Asp Thr Leu Leu Gln Ser Val Ser Gln Ser Ser Ile Asn Arg Asp Thr
1200 1205 1210 1215

gtg gag gat gcc ttt aaa acc tat ctg acc acg ttt gag cag att gcc 5668
Val Glu Asp Ala Phe Lys Thr Tyr Leu Thr Thr Phe Glu Gln Ile Ala
1220 1225 1230

aat ctg aac act gtc agc gga tat cac gat aac gcc agc atg acg cag 5716
Asn Leu Asn Thr Val Ser Gly Tyr His Asp Asn Ala Ser Met Thr Gln
1235 1240 1245

ggg act aca tgg tat gtg ggt cgc agc atc aca gat cag act aac tgg 5764
Gly Thr Thr Trp Tyr Val Gly Arg Ser Ile Thr Asp Gln Thr Asn Trp
1250 1255 1260



tac tgg cgc agc gcc aac cac agc aaa atc caa gac tca atg atg ccc 5812
Tyr Trp Arg Ser Ala Asn His Ser Lys Ile Gln Asp Ser Met Met Pro
1265 1270 1275

gcg aat gcc tgg acc gga tgg aca aaa att aac tgc gga atg aat ccg 5860
Ala Asn Ala Trp Thr Gly Trp Thr Lys Ile Asn Cys Gly Met Asn Pro
1280 1285 1290 1295

tgg tca gat ctt gtg tgc tcg gtg ttt ttc aac agt cgc ctt tat gtc 5908
Trp Ser Asp Leu Val Cys Ser Val Phe Phe Asn Ser Arg Leu Tyr Val
1300 1305 1310

gtc tgg gtc gaa gag aat cag tct gct gat acg gag gca gag agc acg 5956
Val Trp Val Glu Glu Asn Gln Ser Ala Asp Thr Glu Ala Glu Ser Thr
1315 1320 1325

aca acc acg cag cag agc tac acg ctg aaa ctg tcg ttc cgg cgc tac 6004
Thr Thr Thr Gln Gln Ser Tyr Thr Leu Lys Leu Ser Phe Arg Arg Tyr
1330 1335 1340

gac ggt aca tgg agt tcc ccg gtg tcg ttc gac att acc ggc aac atc 6052
Asp Gly Thr Trp Ser Ser Pro Val Ser Phe Asp Ile Thr Gly Asn Ile
1345 1350 1355

gca ttt ccg gaa acg cag ggc atg cat gtg acc tgt aat ccc ctg act 6100
Ala Phe Pro Glu Thr Gln Gly Met His Val Thr Cys Asn Pro Leu Thr
1360 1365 1370 1375

gag cag ctc tat tgc gcg ttt tac tcc gtc acc agc aag ccg gac ttt 6148
Glu Gln Leu Tyr Cys Ala Phe Tyr Ser Val Thr Ser Lys Pro Asp Phe
1380 1385 1390

gat aac gct cag ctg att tct gtg gat aat gat atg acg cta aat gtc 6196
Asp Asn Ala Gln Leu Ile Ser Val Asp Asn Asp Met Thr Leu Asn Val
1395 1400 1405

atc tca gat ata ggg att ttt aag agc gtc agt cac gaa ttt aat acg 6244
Ile Ser Asp Ile Gly Ile Phe Lys Ser Val Ser His Glu Phe Asn Thr
1410 1415 1420

agc act gag aaa ttt att aat aat gtt ttt tca gac cct tcc gct aat 6292
Ser Thr Glu Lys Phe Ile Asn Asn Val Phe Ser Asp Pro Ser Ala Asn
1425 1430 1435

tat ttt gtc agt gca acg agt tta att gat gat gtt atc cac agc gat 6340
Tyr Phe Val Ser Ala Thr Ser Leu Ile Asp Asp Val Ile His Ser Asp
1440 1445 1450 1455

ttc tca ctc ctt aat tct aaa act aca agt act gtt ttt act aat gaa 6388
Phe Ser Leu Leu Asn Ser Lys Thr Thr Ser Thr Val Phe Thr Asn Glu
1460 1465 1470

gat tcc tct ctt ttg acg cca gag ctt cat att aca gca aat gtt tcg 6436
Asp Ser Ser Leu Leu Thr Pro Glu Leu His Ile Thr Ala Asn Val Ser
1475 1480 1485
tgt ttt gtt agt act gct ggc atc gcc act caa tct acc ata gaa aaa 6484
Cys Phe Val Ser Thr Ala Gly Ile Ala Thr Gln Ser Thr Ile Glu Lys
1490 1495 1500

ttc gtt cag gca ggg ata gaa ttt gag gaa att aat ttt tat gca ggc 6532
Phe Val Gln Ala Gly Ile Glu Phe Glu Glu Ile Asn Phe Tyr Ala Gly
1505 1510 1515

cag gcc gcc ggc gga ttt gac gga ttt gtg gga gtg gat gtt tct aat 6580
Gln Ala Ala Gly Gly Phe Asp Gly Phe Val Gly Val Asp Val Ser Asn
1520 1525 1530 1535

tca aaa gta tac cag gtc gga aaa gaa gca gtt ggt gtc act gta aaa 6628
Ser Lys Val Tyr Gln Val Gly Lys Glu Ala Val Gly Val Thr Val Lys
1540 1545 1550

tct tat tcc gtc act ggc gtt agt ggt tct gtt gag tta ttt att gat 6676
Ser Tyr Ser Val Thr Gly Val Ser Gly Ser Val Glu Leu Phe Ile Asp
1555 1560 1565

tca tca aat aaa tac ttc agc gga att ttg tca gat aaa atg ata acc 6724
Ser Ser Asn Lys Tyr Phe Ser Gly Ile Leu Ser Asp Lys Met Ile Thr
1570 1575 1580

gct tta att agc ggc agt aca tca aaa gtt aat tac gtg tcg tct att 6772
Ala Leu Ile Ser Gly Ser Thr Ser Lys Val Asn Tyr Val Ser Ser Ile
1585 1590 1595

ggc tct caa gat ttt tgg agt gta aag tcg ctc atg ccg gca ctt cag 6820
Gly Ser Gln Asp Phe Trp Ser Val Lys Ser Leu Met Pro Ala Leu Gln
1600 1605 1610 1615

ata tat gaa tta atc gat gat atc ata ctg aca tcc ggc gta aat ggg 6868
Ile Tyr Glu Leu Ile Asp Asp Ile Ile Leu Thr Ser Gly Val Asn Gly
1620 1625 1630

act gaa att aaa tcc tgg cct tcc gct gaa tgg tat aat gat aag ctg 6916
Thr Glu Ile Lys Ser Trp Pro Ser Ala Glu Trp Tyr Asn Asp Lys Leu
1635 1640 1645

agt ctg caa tcc ggg aat aat ctt ttc aac acc aaa tcg ctg agt ttt 6964
Ser Leu Gln Ser Gly Asn Asn Leu Phe Asn Thr Lys Ser Leu Ser Phe
1650 1655 1660

acc gtt aat acc agt gat att gtt gaa gat gag ttt gac gtg acg ttt 7012
Thr Val Asn Thr Ser Asp Ile Val Glu Asp Glu Phe Asp Val Thr Phe
1665 1670 1675

acg ttc acc gct gtc gat cag aat aac gtc gtg ctg gcc gcc cgg acg 7060
Thr Phe Thr Ala Val Asp Gln Asn Asn Val Val Leu Ala Ala Arg Thr
1680 1685 1690 1695

gcc ata tta acc gtc att cga aac att aat aat gac act tcc gtt atc 7108
Ala Ile Leu Thr Val Ile Arg Asn Ile Asn Asn Asp Thr Ser Val Ile
1700 1705 1710

gca tta cgt aaa aat acg cgt ggc gcg cag tat att cgt ttc act gcg 7156
Ala Leu Arg Lys Asn Thr Arg Gly Ala Gln Tyr Ile Arg Phe Thr Ala
1715 1720 1725



ggt aac gat gtg gcg ctt att cgc ctc aac acc ctc ttt gcc cgc caa 7204
Gly Asn Asp Val Ala Leu Ile Arg Leu Asn Thr Leu Phe Ala Arg Gln
1730 1735 1740

ctg gtc gac cgg gcg aat acc ggg att gac acc att ctt tcc atg gag 7252
Leu Val Asp Arg Ala Asn Thr Gly Ile Asp Thr Ile Leu Ser Met Glu
1745 1750 1755

acc cag agg ctt acc gaa ccc gcc ctg gaa gag ggg agt gat gtg ttt 7300
Thr Gln Arg Leu Thr Glu Pro Ala Leu Glu Glu Gly Ser Asp Val Phe
1760 1765 1770 1775

atg gac ttc tcc gga gcc aat gcc ctc tat ttc tgg gag ctg ttc tat 7348
Met Asp Phe Ser Gly Ala Asn Ala Leu Tyr Phe Trp Glu Leu Phe Tyr
1780 1785 1790

tac acg ccg atg atg gtg ttc cag cgg ttg ttg cag gaa cag cac ttc 7396
Tyr Thr Pro Met Met Val Phe Gln Arg Leu Leu Gln Glu Gln His Phe
1795 1800 1805

ccg gaa gcc acc cgc tgg ctg cag tat gtc tgg aac ccg gcc ggg cac 7444
Pro Glu Ala Thr Arg Trp Leu Gln Tyr Val Trp Asn Pro Ala Gly His
1810 1815 1820

gtg gta aac ggg gtg ctg cag aat tac acc tgg aat gtc cgt ccg ctg 7492
Val Val Asn Gly Val Leu Gln Asn Tyr Thr Trp Asn Val Arg Pro Leu
1825 1830 1835

gag gag gac acc ggc tgg aac gac tcg ccg ctg gac tcc att gac ccc 7540
Glu Glu Asp Thr Gly Trp Asn Asp Ser Pro Leu Asp Ser Ile Asp Pro
1840 1845 1850 1855

gat gca ata gcc cag tac gac ccc atg cat tac aag gtc gcc acc ttt 7588
Asp Ala Ile Ala Gln Tyr Asp Pro Met His Tyr Lys Val Ala Thr Phe
1860 1865 1870

atg tc^ tac ctc gac ctg ctg att gcc cgc ggt gat gcc gcc tac cgg 7636
Met Ser Tyr Leu Asp Leu Leu Ile Ala Arg Gly Asp Ala Ala Tyr Arg
1875 1880 1885

ctg ctc gag cgg gac acc ctt aac gag gcc cgg atg tgg tac gtc cag 7684
Leu Leu Glu Arg Asp Thr Leu Asn Glu Ala Arg Met Trp Tyr Val Gln
1890 1895 1900

gcc ctg aac ctt ctg ggc gac gag ccc tat att tcc ttt gac gcc gac 7732
Ala Leu Asn Leu Leu Gly Asp Glu Pro Tyr Ile Ser Phe Asp Ala Asp
1905 1910 1915

tgg tcg gcg ttg acc ctg ggt gac gca gcc agc gag gtg acg cga cgc 7780
Trp Ser Ala Leu Thr Leu Gly Asp Ala Ala Ser Glu Val Thr Arg Arg
1920 1925 1930 1935

gat tac cag gag gcc ctg ctg gcc gtg cgc cgg ttg gtg ccc gct ccc 7828
Asp Tyr Gln Glu Ala Leu Leu Ala Val Arg Arg Leu Val Pro Ala Pro
1940 1945 1950



WO 01/16305    PCT/NZ00/00174



gag aca cgg acg gcg aat tcc ctg acg gca ctg ttc ctc ccg cag cag 7876
Glu Thr Arg Thr Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gln Gln
1955 1960 1965

aac gag gtg ctc aaa ggc tac tgg caa acc ttg gca cag cgg ctc cat 7924
Asn Glu Val Leu Lys Gly Tyr Trp Gln Thr Leu Ala Gln Arg Leu His
1970 1975 1980

aac ctg cgc cac aac ctc tcc att gac ggc cag ccg ctt tcc ctg tcc 7972
Asn Leu Arg His Asn Leu Ser Ile Asp Gly Gln Pro Leu Ser Leu Ser
1985 1990 1995

gtc tac gcc acg ccg tcc gaa ccg tcc gcc ctg cag agt gcc gtc gtc 8020
Val Tyr Ala Thr Pro Ser Glu Pro Ser Ala Leu Gln Ser Ala Val Val
2000 2005 2010 2015

aac agc gcg cag ggt gct gca gca ctg ccg gcc gcg gcg atg ccg ctt 8068
Asn Ser Ala Gln Gly Ala Ala Ala Leu Pro Ala Ala Val Met Pro Leu
2020 2025 2030

tac agt ttc ccg gtc atg ctg gag aac gcc cgg ggg atg gtg agc ctg 8116
Tyr Ser Phe Pro Val Met Leu Glu Asn Ala Arg Gly Met Val Ser Leu
2035 2040 2045

ctg acc ggg ttc ggc aac aca ctg ctc ggt att acc gag cgt cag gat 8164
Leu Thr Gly Phe Gly Asn Thr Leu Leu Gly Ile Thr Glu Arg Gln Asp
2050 2055 2060

gcg gag gcg ctg gcc aaa ctg ctg cag acc cag ggc agt gaa ctg ata 8212
Ala Glu Ala Leu Ala Lys Leu Leu Gln Thr Gln Gly Ser Glu Leu Ile
2065 2070 2075

cgc cag ggc ctt cgc cag cag gat aac gtc ctc gag gaa atc gat gcg 8260
Arg Gln Gly Leu Arg Gln Gln Asp Asn Val Leu Glu Glu Ile Asp Ala
2080 2085 2090 2095

gat att gcc gcc ctg gag gag agc cgc cgc ggc gcg cag atg cgt ttt 8308
Asp Ile Ala Ala Leu Glu Glu Ser Arg Arg Gly Ala Gln Met Arg Phe
2100 2105 2110

gaa cgt tac aaa gtg ttg tac gag gcg gac gtc aac acc ggc gaa aaa 8356
Glu Arg Tyr Lys Val Leu Tyr Glu Ala Asp Val Asn Thr Gly Glu Lys
2115 2120 2125

cag gcc atg gac ttg tac ctc agt tcg tcc gtg ctg tcg gca tca acc 8404
Gln Ala Met Asp Leu Tyr Leu Ser Ser Ser Val Leu Ser Ala Ser Thr
2130 2135 2140

gcc gcg ctc ttt ttg gcc gag gcc gcg gcc gat atg ctg ccc aat att 8452
Ala Ala Leu Phe Leu Ala Glu Ala Ala Ala Asp Met Leu Pro Asn Ile
2145 2150 2155

tac ggg ctg gcc gtc ggg ggc tcc cgc tat ggg gca cta ttt aaa gcc 8500
Tyr Gly Leu Ala Val Gly Gly Ser Arg Tyr Gly Ala Leu Phe Lys Ala
2160 2165 2170 2175

acc gcc atc ggc atc cag gtg tcc tcc gat gcc acc cgc ata tca gcg 8548
Thr Ala Ile Gly Ile Gln Val Ser Ser Asp Ala Thr Arg Ile Ser Ala
2180 2185 2190



gac aaa ate age cag teg gaa gtg tac cgc cgt cgc egg gag gag tgg 
Asp Lys He Ser Gin Ser Glu Val Tyr Arg Arg Arg Arg Glu Glu Trp 
2195 2200 2205 



8596 



gaa ate <?ag cgt gat agt gcg cag tct gac gtg gcg cag at-t gat gee 
Glu He Gin Arg Asp Ser Ala Gin Ser Asp Val Ala Gin He Asp Ala 
2210 2215 2220 



8644 



cag ctg gcg gec atg gca gtg cgc egg gaa ggg get gag ctg cag aaa 
Gin Leu Ala Ala Met Ala Val Arg Arg Glu Gly Ala Glu Leu Gin Lys 
2225 . 2230 2235 



8692 



act tac ctt gag acc cag cag acc cag gca cag gcg cag ttg gca ttc 
Thr Tyr Leu Glu Thr Gin Gin Thr Gin Ala Gin Ala Gin Leu Ala Phe 
2240 2245 2250 225S 



8740 



ctg cag agt aag ttc aac aat acg get ctg tac age tgg ctg egg ggc 
Leu Gin Ser Lys Phe Asn Asn Thr Ala Leu Tyr Ser Trp Leu Arg Gly 
2260 2265 2270 



8788 



agg ttg tec gec att tat tac cag ttc tat gac ctg gca gta tec cgc 
Arg Leu Ser Ala He Tyr Tyr Gin Phe Tyr Asp Leu Ala Val Ser Arg 
2275 2280 2285 



8836 



tgc ctg atg gcg caa cag gee tgg cag tgg gat aaa ttc gag act agg 
Cys Leu Met Ala Gin Gin Ala Trp Gin Trp Asp Lys Phe Glu Thr Arg 
2290 2295 2300 



6884 



teg ttt ate cag ccg ggg gec tgg atg ggg gca aat gec ggt ctg ctg 
Ser Phe ^Ile' Gin Pro Gly Ala Trp Met Gly Ala Asn Ala Gly Leu Leu 
2305 2310 2315 



8932 



gec ggg gaa acc ctg atg ctg aat ctg gcg cag atg gag cag gee tgg 
Ala Gly Glu Thr Leu Met Leu Asn Leu Ala Gin Met Glu Gin Ala Trp 
2320 2325 2330 2335 



8980 



ctg acg ggg gat gag egg gca ata gag gtg acg egg acg gtc tgc ctg 
Leu Thr Gly Asp Glu Arg Ala He Glu Val Thr Arg Thr Val Cys Leu 
2340 2345 2350 



9028 



teg gag gtc tat acc age etc gcg gag gat gcg gca ttc tct ctg gec 
Ser Glu Val Tyr Thr Ser Leu Ala Glu Asp Ala Ala Phe Ser Leu Ala 
2355 2360 2365 



9076 



gac aag gtg gtg gaa ctg gtc agt aac ggt teg ggc agt gcg ggt acg 
Asp Lys Val Val Glu Leu Val Ser Asn Gly Ser Gly Ser Ala Gly Thr 
2370 2375 2380 



9124 



aaa age aac gga tta cag atg gat caa cag caa etc gag gee acc ctg 
Lys Ser Asn Gly Leu Gin Met Asp Gin Gin Gin Leu Glu Ala Thr Leu 
2385 2390 2395 



9172 



aaa ctg gct gac ctc ggt atc ggc aac gat tac ccg gtc tcc ctt ggc 9220
Lys Leu Ala Asp Leu Gly Ile Gly Asn Asp Tyr Pro Val Ser Leu Gly
2400 2405 2410 2415

acc atg agg cgc atc aaa caa ata agc gtc acg ctc ccg gcg ctg gtc 9268
Thr Met Arg Arg Ile Lys Gln Ile Ser Val Thr Leu Pro Ala Leu Val
2420 2425 2430

ggc ccc tat cag gac gtc cgt gcg gtt ctc agc tac ggc gga agt atg 9316
Gly Pro Tyr Gln Asp Val Arg Ala Val Leu Ser Tyr Gly Gly Ser Met
2435 2440 2445

gtc atg ccc cgg ggt tgc agc gcg ctg gcg gtc tca cac gga atg aac 9364
Val Met Pro Arg Gly Cys Ser Ala Leu Ala Val Ser His Gly Met Asn
2450 2455 2460

gac agc ggc caa ttc caa ctg gat ttc aat gac ccg cgt tac ctg ccg 9412
Asp Ser Gly Gln Phe Gln Leu Asp Phe Asn Asp Pro Arg Tyr Leu Pro
2465 2470 2475

ttt gaa gga ctt cca gtt gat gac aca ggg acc ctg aca ctg agc ttc 9460
Phe Glu Gly Leu Pro Val Asp Asp Thr Gly Thr Leu Thr Leu Ser Phe
2480 2485 2490 2495

ccg gat gct gac ggc aaa caa cag gcg atg ctc ctc agt ctg agc gac 9508
Pro Asp Ala Asp Gly Lys Gln Gln Ala Met Leu Leu Ser Leu Ser Asp
2500 2505 2510

atc atc ctg cat atc cgt tac acc att atc agc tga tag gtatcaacat 9557
Ile Ile Leu His Ile Arg Tyr Thr Ile Ile Ser
2515 2520

agcgcaggcc cccgaacgag ggcctgcgag gagactgagc atg caa aat cat caa 9612
Met Gln Asn His Gln
2525

gac atg gcc att act gcc ccc acg ttg cct tcc ggg ggc ggt gcg gtc 9660
Asp Met Ala Ile Thr Ala Pro Thr Leu Pro Ser Gly Gly Gly Ala Val
2530 2535 2540 2545

acc ggg ctc aag ggt gat atc gcg gcg gca ggg ccg gat ggt gcg gcg 9708
Thr Gly Leu Lys Gly Asp Ile Ala Ala Ala Gly Pro Asp Gly Ala Ala
2550 2555 2560

acc ctg agt att ccc ttg ccg gtt agc ccc ggt cgg ggt tac gcc ccc 9756
Thr Leu Ser Ile Pro Leu Pro Val Ser Pro Gly Arg Gly Tyr Ala Pro
2565 2570 2575

act ggg gca ctt aat tat cac agc cgg tcg ggg aac ggc ccc ttt ggc 9804
Thr Gly Ala Leu Asn Tyr His Ser Arg Ser Gly Asn Gly Pro Phe Gly
2580 2585 2590

att ggc tgg ggt atc ggc ggt gct gct gtc cag cgt cgt acg cgc aac 9852
Ile Gly Trp Gly Ile Gly Gly Ala Ala Val Gln Arg Arg Thr Arg Asn
2595 2600 2605

gga gca cct acc tac gat gat act gat gaa ttc acc ggt ccg gac ggt 9900
Gly Ala Pro Thr Tyr Asp Asp Thr Asp Glu Phe Thr Gly Pro Asp Gly
2610 2615 2620 2625

gag gtg ctg gtg ccg gca ctc acg gct gct ggc acc caa gaa gca cgg 9948
Glu Val Leu Val Pro Ala Leu Thr Ala Ala Gly Thr Gln Glu Ala Arg
2630 2635 2640



cag gcc acc tca cta ctg ggg ata aac cca ggc gga agc ttc aac gtt 9996
Gln Ala Thr Ser Leu Leu Gly Ile Asn Pro Gly Gly Ser Phe Asn Val
2645 2650 2655

cag gtt tac cgt tca cgt acg gag ggt agt ctc agc cgc ctt gag cgt 10044
Gln Val Tyr Arg Ser Arg Thr Glu Gly Ser Leu Ser Arg Leu Glu Arg
2660 2665 2670

tgg ctg ccc gcc gac gag aca gaa acg gaa ttt tgg gtg tta tat acc 10092
Trp Leu Pro Ala Asp Glu Thr Glu Thr Glu Phe Trp Val Leu Tyr Thr
2675 2680 2685

cct gac gga cag gtg gct ctg ctg ggc cga aat gcg cag gct cgc atc 10140
Pro Asp Gly Gln Val Ala Leu Leu Gly Arg Asn Ala Gln Ala Arg Ile
2690 2695 2700 2705

agc aac ccc aca gcc cca aca cag acg gcg gtt tgg ctg atg gag tcc 10188
Ser Asn Pro Thr Ala Pro Thr Gln Thr Ala Val Trp Leu Met Glu Ser
2710 2715 2720

tcg gta tca ctt acc ggc gaa cag atg tat tac caa tac cgt gcg gaa 10236
Ser Val Ser Leu Thr Gly Glu Gln Met Tyr Tyr Gln Tyr Arg Ala Glu
2725 2730 2735

gat gat gac ggt tgt gac gag gcg gag cgc gac gcg cac ccg cag gcc 10284
Asp Asp Asp Gly Cys Asp Glu Ala Glu Arg Asp Ala His Pro Gln Ala
2740 2745 2750

ggc gcc caa cgt tat ccg gtg gcg gtc tgg tat ggt aac cgt cag gcg 10332
Gly Ala Gln Arg Tyr Pro Val Ala Val Trp Tyr Gly Asn Arg Gln Ala
2755 2760 2765

gct cgg acg cta ccg gcg ctg gtg tcg aca cca tca atg gat agc tgg 10380
Ala Arg Thr Leu Pro Ala Leu Val Ser Thr Pro Ser Met Asp Ser Trp
2770 2775 2780 2785

ctg ttt atc ctg gtg ttt gat tat ggt gag cgt agc tcg gtg ctg tct 10428
Leu Phe Ile Leu Val Phe Asp Tyr Gly Glu Arg Ser Ser Val Leu Ser
2790 2795 2800

gaa gcg ccg gcc tgg caa aca cca gga agt ggg gag tgg ctg tgt cgt 10476
Glu Ala Pro Ala Trp Gln Thr Pro Gly Ser Gly Glu Trp Leu Cys Arg
2805 2810 2815

cag gat tgt ttt tcc ggg tat gag ttt ggt ttt aac ctg cgg act cgc 10524
Gln Asp Cys Phe Ser Gly Tyr Glu Phe Gly Phe Asn Leu Arg Thr Arg
2820 2825 2830

cgc ctg tgc cgt cag gtt ttg atg ttc cat tac cta ggt gtt ctg gcg 10572
Arg Leu Cys Arg Gln Val Leu Met Phe His Tyr Leu Gly Val Leu Ala
2835 2840 2845



ggg agt tcg gga gcg aat gat gcg cca gca ttg att tct cgc ctg ttg 10620
Gly Ser Ser Gly Ala Asn Asp Ala Pro Ala Leu Ile Ser Arg Leu Leu
2850 2855 2860 2865

ctg gac tac agg gaa agt cct tca ctc agt ctg ctc gag aac gtg cac 10668
Leu Asp Tyr Arg Glu Ser Pro Ser Leu Ser Leu Leu Glu Asn Val His
2870 2875 2880

cag gtg gct tat gag tcg gac ggg acg tct tgt gcc ttg ccg gca ctg 10716
Gln Val Ala Tyr Glu Ser Asp Gly Thr Ser Cys Ala Leu Pro Ala Leu
2885 2890 2895

gca ttg ggg tgg caa acc ttt acc ccg ccg aca ttg tcg gca tgg cag 10764
Ala Leu Gly Trp Gln Thr Phe Thr Pro Pro Thr Leu Ser Ala Trp Gln
2900 2905 2910

acg cgt gac gat atg ggc aag ttg agt ttg ctt caa ccc tat cag ctt 10812
Thr Arg Asp Asp Met Gly Lys Leu Ser Leu Leu Gln Pro Tyr Gln Leu
2915 2920 2925

gta gac ctt aac ggc gaa ggfr gtg gtg ggt atc ctg tat cag gac agc 10860
Val Asp Leu Asn Gly Glu Gly Val Val Gly Ile Leu Tyr Gln Asp Ser
2930 2935 2940 2945

ggt gcc tgg tgg tac cgt gaa ccg gta cgc cag tcg ggg gat gat ccg 10908
Gly Ala Trp Trp Tyr Arg Glu Pro Val Arg Gln Ser Gly Asp Asp Pro
2950 2955 2960

gat gct gtg acc tgg ggg gcg gct gcg gcc ctg ccg aca atg ccc gct 10956
Asp Ala Val Thr Trp Gly Ala Ala Ala Ala Leu Pro Thr Met Pro Ala
2965 2970 2975

ttg cat aac agc ggc atc ctg gcg gat ctt aat ggg gat ggt cgg ctg 11004
Leu His Asn Ser Gly Ile Leu Ala Asp Leu Asn Gly Asp Gly Arg Leu
2980 2985 2990

gag tgg gtc gtt acc gcc ccc ggt gtg gcg ggg atg tat gat cgc acc 11052
Glu Trp Val Val Thr Ala Pro Gly Val Ala Gly Met Tyr Asp Arg Thr
2995 3000 3005

ccc ggc cgc gac tgg ttg cat ttc acc ccc ctg tca gcc ttg ccc gta 11100
Pro Gly Arg Asp Trp Leu His Phe Thr Pro Leu Ser Ala Leu Pro Val
3010 3015 3020 3025

gaa tat gcg cat cca aaa gca gtg ctc gcc gat atc ctg ggg gct ggg 11148
Glu Tyr Ala His Pro Lys Ala Val Leu Ala Asp Ile Leu Gly Ala Gly
3030 3035 3040

tta acg gac atg gtg ctt atc ggg ccg cgc agt gtt cgc ctc tat tcc 11196
Leu Thr Asp Met Val Leu Ile Gly Pro Arg Ser Val Arg Leu Tyr Ser
3045 3050 3055

ggc aaa aac gat ggt tgg aat aaa ggg gag acc gtg cag caa acg gaa 11244
Gly Lys Asn Asp Gly Trp Asn Lys Gly Glu Thr Val Gln Gln Thr Glu
3060 3065 3070

aga ctc act ctg ccg gtc ccg ggg gtt gac cca cgt acc ctc gtg gcg 11292
Arg Leu Thr Leu Pro Val Pro Gly Val Asp Pro Arg Thr Leu Val Ala
3075 3080 3085

ttc agt gat atg gct ggc agt gga cag cag cat ttg acg gag gtg cgt 11340
Phe Ser Asp Met Ala Gly Ser Gly Gln Gln His Leu Thr Glu Val Arg
3090 3095 3100 3105



gct aat gga gta cgt tac tgg cca aac ctg ggg cac ggt cgt ttc ggt 11388
Ala Asn Gly Val Arg Tyr Trp Pro Asn Leu Gly His Gly Arg Phe Gly
3110 3115 3120

cag ccg gtg aat att ccc ggt ttt agc cag tca gtg act acg ttt aac 11436
Gln Pro Val Asn Ile Pro Gly Phe Ser Gln Ser Val Thr Thr Phe Asn
3125 3130 3135

cct gac cag ata ttg ctg gcc gat acc gac ggt tcc ggt acc acg gac 11484
Pro Asp Gln Ile Leu Leu Ala Asp Thr Asp Gly Ser Gly Thr Thr Asp
3140 3145 3150

ctg att tat gcg atg agt gac cgg tta gtc att tat ttc aac cag agt 11532
Leu Ile Tyr Ala Met Ser Asp Arg Leu Val Ile Tyr Phe Asn Gln Ser
3155 3160 3165

ggt aat tat ttc gcc gag ccg cat acg ctg ctc ttg ccg aaa ggt gtg 11580
Gly Asn Tyr Phe Ala Glu Pro His Thr Leu Leu Leu Pro Lys Gly Val
3170 3175 3180 3185

cgc tat gat cgc acc tgc agt ctg caa gtg gcg gat atc cag ggg ctg 11628
Arg Tyr Asp Arg Thr Cys Ser Leu Gln Val Ala Asp Ile Gln Gly Leu
3190 3195 3200

ggg gtg cct agc ctg tta ctg acg gtc ccc cat gtc gcg cct cat cac 11676
Gly Val Pro Ser Leu Leu Leu Thr Val Pro His Val Ala Pro His His
3205 3210 3215

tgg gtg tgc cat tta tcg gca gac aaa ccc tgg ttg ttg aat ggc atg 11724
Trp Val Cys His Leu Ser Ala Asp Lys Pro Trp Leu Leu Asn Gly Met
3220 3225 3230

aac aac aat atg ggg gcc cgg cat gca ctg cac tat cgc agt tcg gtg 11772
Asn Asn Asn Met Gly Ala Arg His Ala Leu His Tyr Arg Ser Ser Val
3235 3240 3245

cag ttc tgg ctg gat gag aaa gcc gag gca ctg gcg gca ggc agt tcc 11820
Gln Phe Trp Leu Asp Glu Lys Ala Glu Ala Leu Ala Ala Gly Ser Ser
3250 3255 3260 3265

cct gcc tgc tac ctg cca ttt aca ttg cat acc ctg tgg cgt tcg gtg 11868
Pro Ala Cys Tyr Leu Pro Phe Thr Leu His Thr Leu Trp Arg Ser Val
3270 3275 3280

gtg cag gat gag atc acc ggt aac cgt ctg gtc agc gac gtg ctt tat 11916
Val Gln Asp Glu Ile Thr Gly Asn Arg Leu Val Ser Asp Val Leu Tyr
3285 3290 3295

cgc cac ggc gtc tgg gac ggg cag gaa cgc gag ttt cgg ggg ttt ggt 11964
Arg His Gly Val Trp Asp Gly Gln Glu Arg Glu Phe Arg Gly Phe Gly
3300 3305 3310



ttt gtt gag atc agg gat acc gat acc ttg gca agc cag ggt acg gcg 12012
Phe Val Glu Ile Arg Asp Thr Asp Thr Leu Ala Ser Gln Gly Thr Ala
3315 3320 3325

acg gaa ctg agt atg cct tct gtg agc cgg aac tgg tat gcc acc ggg 12060
Thr Glu Leu Ser Met Pro Ser Val Ser Arg Asn Trp Tyr Ala Thr Gly
3330 3335 3340 3345

gta ccg gca gta gac gag cgt ctg ccg gag acg tat tgg caa aac gat 12108
Val Pro Ala Val Asp Glu Arg Leu Pro Glu Thr Tyr Trp Gln Asn Asp
3350 3355 3360

gcc gcc gct ttt gcc gat ttc gcg acc cgt ttc act gtc ggt tca gga 12156
Ala Ala Ala Phe Ala Asp Phe Ala Thr Arg Phe Thr Val Gly Ser Gly
3365 3370 3375

gag gat gag cag aca tat act ccg gac gac agc aag aca ttc tgg ttg 12204
Glu Asp Glu Gln Thr Tyr Thr Pro Asp Asp Ser Lys Thr Phe Trp Leu
3380 3385 3390

cag cga gcc ctg aaa ggc atc ctg ctg cgc agt gag tta tac ggt gcc 12252
Gln Arg Ala Leu Lys Gly Ile Leu Leu Arg Ser Glu Leu Tyr Gly Ala
3395 3400 3405

gat ggc agc agc cag gcc gat atc cct tac agc gtc act gag tct cgc 12300
Asp Gly Ser Ser Gln Ala Asp Ile Pro Tyr Ser Val Thr Glu Ser Arg
3410 3415 3420 3425

ccg cag gta cgg cta gtt gaa gcg aat gga gac tac ccg gtg gtg tgg 12348
Pro Gln Val Arg Leu Val Glu Ala Asn Gly Asp Tyr Pro Val Val Trp
3430 3435 3440

ccg atg ggc gcg gaa agc cgt acg tca gtt tat gaa cgg tac cac aat 12396
Pro Met Gly Ala Glu Ser Arg Thr Ser Val Tyr Glu Arg Tyr His Asn
3445 3450 3455

gat cct caa tgc caa cag cag gcg gta ctc ctc agt gat gaa tac ggt 12444
Asp Pro Gln Cys Gln Gln Gln Ala Val Leu Leu Ser Asp Glu Tyr Gly
3460 3465 3470

ttc cca ctg cgt cag gtc agt gtc aat tat cca cga cgc cct ccg tcg 12492
Phe Pro Leu Arg Gln Val Ser Val Asn Tyr Pro Arg Arg Pro Pro Ser
3475 3480 3485

gcg gac aat cca tat ccg gcg tcc tta ccg gcg acg ctg ttc gcc aac 12540
Ala Asp Asn Pro Tyr Pro Ala Ser Leu Pro Ala Thr Leu Phe Ala Asn
3490 3495 3500 3505

agt tat gac gag cag cag cag ata tta cgc ctg ggg ttg caa cag agc 12588
Ser Tyr Asp Glu Gln Gln Gln Ile Leu Arg Leu Gly Leu Gln Gln Ser
3510 3515 3520

agt gca cat cac ctt gtt tca ctg tct gag ggg cat tgg ttg ttg ggg 12636
Ser Ala His His Leu Val Ser Leu Ser Glu Gly His Trp Leu Leu Gly
3525 3530 3535

ttg gcg gag gcg tcg cgg gac gat gta ttc acg tac tct gcg gac aac 12684
Leu Ala Glu Ala Ser Arg Asp Asp Val Phe Thr Tyr Ser Ala Asp Asn
3540 3545 3550

gtg ccg gaa ggg ggt ctg acg ctg gaa cac ctg ttg gcg ccc gaa agc 12732
Val Pro Glu Gly Gly Leu Thr Leu Glu His Leu Leu Ala Pro Glu Ser
3555 3560 3565

ctg gtc tcg gat agt cag gtc ggt acg ctg gcg ggt cag cag caa gtc 12780
Leu Val Ser Asp Ser Gln Val Gly Thr Leu Ala Gly Gln Gln Gln Val
3570 3575 3580 3585

tgg tat ctg gat tca caa gac gtt gcc acc gtc gct gct ccg cca ctc 12828
Trp Tyr Leu Asp Ser Gln Asp Val Ala Thr Val Ala Ala Pro Pro Leu
3590 3595 3600

ccc ccc aag gta gct ttt atc gaa acg gcc gtg ctg gat gag ggt atg 12876
Pro Pro Lys Val Ala Phe Ile Glu Thr Ala Val Leu Asp Glu Gly Met
3605 3610 3615

gtc agt tca ctg gct gcc tac att gtg gat gaa cat ctc gag caa gcc 12924
Val Ser Ser Leu Ala Ala Tyr Ile Val Asp Glu His Leu Glu Gln Ala
3620 3625 3630

ggt tac cgg caa tcc gga tac ctt ttc cct cga ggc agg gaa gca gaa 12972
Gly Tyr Arg Gln Ser Gly Tyr Leu Phe Pro Arg Gly Arg Glu Ala Glu
3635 3640 3645

cag gca ttg tgg acc cag tgt cag gga tat gtt acc tat gcc ggc gca 13020
Gln Ala Leu Trp Thr Gln Cys Gln Gly Tyr Val Thr Tyr Ala Gly Ala
3650 3655 3660 3665

gag cat ttc tgg cta ccg cta tcc ttt cgg gac agt atg ttg acc ggc 13068
Glu His Phe Trp Leu Pro Leu Ser Phe Arg Asp Ser Met Leu Thr Gly
3670 3675 3680

cca gtt acc gtg acg cgt gac gcg tac gac tgc gtc atc acg cag tgg 13116
Pro Val Thr Val Thr Arg Asp Ala Tyr Asp Cys Val Ile Thr Gln Trp
3685 3690 3695

cag gat gcc gca ggg att gtc acc aca gcc gac tat gac tgg cgc ttc 13164
Gln Asp Ala Ala Gly Ile Val Thr Thr Ala Asp Tyr Asp Trp Arg Phe
3700 3705 3710

ctg acg ccc gtc cgg gtg acg gac ccc aat gat aat ctg cag tcc gtc 13212
Leu Thr Pro Val Arg Val Thr Asp Pro Asn Asp Asn Leu Gln Ser Val
3715 3720 3725

act ctg gat gct ctg ggc cgg gtg acc acc ctg cga ttc tgg ggc acg 13260
Thr Leu Asp Ala Leu Gly Arg Val Thr Thr Leu Arg Phe Trp Gly Thr
3730 3735 3740 3745

gag aat ggt att gcc acc ggt tac agt gat gcc acg ttg tcc gtt ccg 13308
Glu Asn Gly Ile Ala Thr Gly Tyr Ser Asp Ala Thr Leu Ser Val Pro
3750 3755 3760

gac ggc gca gca gcc gct ctg gcg ttg acg gcg ccc cta cca gta gca 13356
Asp Gly Ala Ala Ala Ala Leu Ala Leu Thr Ala Pro Leu Pro Val Ala
3765 3770 3775



cag tgt ctg gtg tat gtc acg gac agt tgg gga gat gac gac aat gag 13404
Gln Cys Leu Val Tyr Val Thr Asp Ser Trp Gly Asp Asp Asp Asn Glu
3780 3785 3790

aaa atg ccc ccg cac gtg gtc gtg ctg gct acc gat cgc tat gac agt 13452
Lys Met Pro Pro His Val Val Val Leu Ala Thr Asp Arg Tyr Asp Ser
3795 3800 3805

gat acc gga cag cag gtc cgc caa cag gtg aca ttc agt gac ggt ttt 13500
Asp Thr Gly Gln Gln Val Arg Gln Gln Val Thr Phe Ser Asp Gly Phe
3810 3815 3820 3825

ggg cgt gag ttg caa tcg gca acc cgg cag gcc gag ggc aac gcc tgg 13548
Gly Arg Glu Leu Gln Ser Ala Thr Arg Gln Ala Glu Gly Asn Ala Trp
3830 3835 3840

caa cga gga cgc gac ggc aaa ctg gtg acg gcc agt gac gga ttg ccg 13596
Gln Arg Gly Arg Asp Gly Lys Leu Val Thr Ala Ser Asp Gly Leu Pro
3845 3850 3855

gtc act gta gca acg aat ttc cgc tgg gcg gtc acc ggg agg gcg gag 13644
Val Thr Val Ala Thr Asn Phe Arg Trp Ala Val Thr Gly Arg Ala Glu
3860 3865 3870

tat gac aat aaa ggt ctg cct gtt cgg gtt tat cag ccg tat ttt ctg 13692
Tyr Asp Asn Lys Gly Leu Pro Val Arg Val Tyr Gln Pro Tyr Phe Leu
3875 3880 3885

gac agt tgg caa tat gtc agt gat gac agt gcc cgc cag gac ctg tat 13740
Asp Ser Trp Gln Tyr Val Ser Asp Asp Ser Ala Arg Gln Asp Leu Tyr
3890 3895 3900 3905

gcc gac acg cac ttt tac gat ccg acg gca cgg gaa tgg cag gtt att 13788
Ala Asp Thr His Phe Tyr Asp Pro Thr Ala Arg Glu Trp Gln Val Ile
3910 3915 3920

acg gca aaa ggt gaa cgg cga cag gtg ctg tat acc ccg tgg ttt gtg 13836
Thr Ala Lys Gly Glu Arg Arg Gln Val Leu Tyr Thr Pro Trp Phe Val
3925 3930 3935

gtc agt gaa gac gag aat gat acc gtt ggg cta aac gac gca tcc tga 13884
Val Ser Glu Asp Glu Asn Asp Thr Val Gly Leu Asn Asp Ala Ser
3940 3945 3950

ctgggaagga gggggggacg gtg atg agt ccg tcg ccc ctg aca ggc gct gcc 13937
Met Ser Pro Ser Pro Leu Thr Gly Ala Ala
3955 3960

ctg atg gag aca aag atg aaa ata cac tat cag gtt gcg gcg gtt gtg 13985
Leu Met Glu Thr Lys Met Lys Ile His Tyr Gln Val Ala Ala Val Val
3965 3970 3975

ctg aca ggt gtt atg gtt tgg ggg ctt tcc cat tgg cgt tac acc gtc 14033
Leu Thr Gly Val Met Val Trp Gly Leu Ser His Trp Arg Tyr Thr Val
3980 3985 3990 3995

ggt tac cac gcg gca gat act caa tgg caa caa cgc cag gcc gaa cag 14081
Gly Tyr His Ala Ala Asp Thr Gln Trp Gln Gln Arg Gln Ala Glu Gln
4000 4005 4010

gaa agg gcc gat gcg ttg gcc ctc ctg gca gca gaa acc cgg gaa aga 14129
Glu Arg Ala Asp Ala Leu Ala Leu Leu Ala Ala Glu Thr Arg Glu Arg
4015 4020 4025



aag tgg gag cag caa cga cag act gac atg aac aag gtg gct ata cat 14177
Lys Trp Glu Gln Gln Arg Gln Thr Asp Met Asn Lys Val Ala Ile His
4030 4035 4040

gct gaa gaa gaa ctg gct gct gcg cgt gac gct gcc gct gat gct cag 14225
Ala Glu Glu Glu Leu Ala Ala Ala Arg Asp Ala Ala Ala Asp Ala Gln
4045 4050 4055

cgc act ggt cag cgc ctg cag cac acc gtt acc acc ctc cag cgg caa 14273
Arg Thr Gly Gln Arg Leu Gln His Thr Val Thr Thr Leu Gln Arg Gln
4060 4065 4070 4075

ctt gcc agt cgt gaa acc cgc cgc ctt tcc gca gct acc gct atc ggt 14321
Leu Ala Ser Arg Glu Thr Arg Arg Leu Ser Ala Ala Thr Ala Ile Gly
4080 4085 4090

aca gac gac ctc gga ggc caa ccc ggc gtt ttg ttt gcc gaa ctg ttc 14369
Thr Asp Asp Leu Gly Gly Gln Pro Gly Val Leu Phe Ala Glu Leu Phe
4095 4100 4105

cgc cgc gct gac cag aga gcg gga gag ctg gca gcg tat gct gac agg 14417
Arg Arg Ala Asp Gln Arg Ala Gly Glu Leu Ala Ala Tyr Ala Asp Arg
4110 4115 4120

acc aga gtg aaa tgg cag gcc tgc ggg cgc gcc tat cag gcg gct acg 14465
Thr Arg Val Lys Trp Gln Ala Cys Gly Arg Ala Tyr Gln Ala Ala Thr
4125 4130 4135

cac gaa gca gaa aaa taa ggcgatttag ccgttaagga aaagtgacgg 14513
His Glu Ala Glu Lys
4140 4145

tgttttcgcg attaatatta acaggagatc ac atg agc aca tcc ttg ttc agt 14566
Met Ser Thr Ser Leu Phe Ser
4150

agc acc ccg tcg gtc gcg gtg ctc gac aac cgc ggc ctg ttg gtg cgg 14614
Ser Thr Pro Ser Val Ala Val Leu Asp Asn Arg Gly Leu Leu Val Arg
4155 4160 4165

gag ctg cag tac tac cgc cat ccg gat aca ccg gag gag acg gac gag 14662
Glu Leu Gln Tyr Tyr Arg His Pro Asp Thr Pro Glu Glu Thr Asp Glu
4170 4175 4180

cgt atc acc tgc cat cag cac gat gag cgc ggc agc ttg tca caa agc 14710
Arg Ile Thr Cys His Gln His Asp Glu Arg Gly Ser Leu Ser Gln Ser
4185 4190 4195 4200

gcc gac ccg cgg tta cac gcg gcc ggt ctg aca aat ttc acg tac ctg 14758
Ala Asp Pro Arg Leu His Ala Ala Gly Leu Thr Asn Phe Thr Tyr Leu
4205 4210 4215

aat agc ctg acc ggg aca gta ctg cag agc gtc agc gcc gat gcc ggt 14806
Asn Ser Leu Thr Gly Thr Val Leu Gln Ser Val Ser Ala Asp Ala Gly
4220 4225 4230






WO 01/16305 



PCT/NZ00/00174 






acg tcg ctg gaa ctg agc gat gcc gcc ggg cgg gcg ttt ctg gcc gtc 14854
Thr Ser Leu Glu Leu Ser Asp Ala Ala Gly Arg Ala Phe Leu Ala Val
4235 4240 4245

acc ggg gct ggg acg gaa gac gcg gtc acc cgc acc tgg caa tat gaa 14902
Thr Gly Ala Gly Thr Glu Asp Ala Val Thr Arg Thr Trp Gln Tyr Glu
4250 4255 4260

gac gat acc ctg ccg ggc cgc ccg ctg agc atc acc gag cag gtt acc 14950
Asp Asp Thr Leu Pro Gly Arg Pro Leu Ser Ile Thr Glu Gln Val Thr
4265 4270 4275 4280

ggt gaa gcc gcc caa att acg gaa cgc ttc gtg tac gct ggc aat acg 14998
Gly Glu Ala Ala Gln Ile Thr Glu Arg Phe Val Tyr Ala Gly Asn Thr
4285 4290 4295

gat gcc gag aag att ctc aat ctg gct ggc cag tgt gtc agt cat tac 15046
Asp Ala Glu Lys Ile Leu Asn Leu Ala Gly Gln Cys Val Ser His Tyr
4300 4305 4310

gat acc gcc gga ctg gtg cag acg gac agc atc gcc ctg agc ggc gtg 15094
Asp Thr Ala Gly Leu Val Gln Thr Asp Ser Ile Ala Leu Ser Gly Val
4315 4320 4325

ccg ctc gcc gtc acg cgg cag ttg ctg ccc gac gcg gcg ggg gcc aac 15142
Pro Leu Ala Val Thr Arg Gln Leu Leu Pro Asp Ala Ala Gly Ala Asn
4330 4335 4340

tgg atg ggt gag gat gcc tcg gcc tgg aat gac ctg ctg gat ggg gag 15190
Trp Met Gly Glu Asp Ala Ser Ala Trp Asn Asp Leu Leu Asp Gly Glu
4345 4350 4355 4360

acg ttc ttc acc cag acc cac gct gat gcg acc ggc gcc gtc ctg agc 15238
Thr Phe Phe Thr Gln Thr His Ala Asp Ala Thr Gly Ala Val Leu Ser
4365 4370 4375

atc acc gat gca aaa ggt aat ctg cag cgt gtg gca tat gat gtg gct 15286
Ile Thr Asp Ala Lys Gly Asn Leu Gln Arg Val Ala Tyr Asp Val Ala
4380 4385 4390

ggg ctg cta tcg ggc agt tgg ttg acg ctg aag gac ggc acg gag cag 15334
Gly Leu Leu Ser Gly Ser Trp Leu Thr Leu Lys Asp Gly Thr Glu Gln
4395 4400 4405

gtc atc gtg gcc tcc ctg acg tac tcg gcc gcc ggg aaa aag ttg cgt 15382
Val Ile Val Ala Ser Leu Thr Tyr Ser Ala Ala Gly Lys Lys Leu Arg
4410 4415 4420

gaa gaa cac ggc aac ggc g'cg gta acc tcg tat att tac gag ccg gaa 15430
Glu Glu His Gly Asn Gly Val Val Thr Ser Tyr Ile Tyr Glu Pro Glu
4425 4430 4435 4440

aca cag cgc ctg acg ggg att aaa acg gaa cgt ccg tct ggg cac gtt 15478
Thr Gln Arg Leu Thr Gly Ile Lys Thr Glu Arg Pro Ser Gly His Val
4445 4450 4455

gcc gga gca aaa gtg ctg cag gac ctg cgc tat acg tat gac ccg gta 15526
Ala Gly Ala Lys Val Leu Gln Asp Leu Arg Tyr Thr Tyr Asp Pro Val
4460 4465 4470



ggc aac gta ctc agc gtc aat aac gat gcg gaa gag acc cgc ttc tgg 15574
Gly Asn Val Leu Ser Val Asn Asn Asp Ala Glu Glu Thr Arg Phe Trp
4475 4480 4485

cgt aac cag aaa gtg gta ccg gag aat acg tac atc tac gac agc ctg 15622
Arg Asn Gln Lys Val Val Pro Glu Asn Thr Tyr Ile Tyr Asp Ser Leu
4490 4495 4500

tac cag ctg gtc agc gcc aca ggg cgt gag atg gcc aat gcc ggc cag 15670
Tyr Gln Leu Val Ser Ala Thr Gly Arg Glu Met Ala Asn Ala Gly Gln
4505 4510 4515 4520

cag ggc aac gac tta cca tcc gct aca gcc ccc ctt cct aca gac agc 15718
Gln Gly Asn Asp Leu Pro Ser Ala Thr Ala Pro Leu Pro Thr Asp Ser
4525 4530 4535

tct gcc tac acc aat tac acg cgc acc tac cgt tat gac cgt ggt ggc 15766
Ser Ala Tyr Thr Asn Tyr Thr Arg Thr Tyr Arg Tyr Asp Arg Gly Gly
4540 4545 4550

aac ctg acg cag atg cgc cac agt gcc cct gcc acg aac aat aat tat 15814
Asn Leu Thr Gln Met Arg His Ser Ala Pro Ala Thr Asn Asn Asn Tyr
4555 4560 4565

acg aca gac atc acg gtt agt gac cgc agc aat agg gcg gta ctg agc 15862
Thr Thr Asp Ile Thr Val Ser Asp Arg Ser Asn Arg Ala Val Leu Ser
4570 4575 4580

acg ttg gcg gaa gtg ccg tca gat gtt gat atg ctg ttc agt gca gga 15910
Thr Leu Ala Glu Val Pro Ser Asp Val Asp Met Leu Phe Ser Ala Gly
4585 4590 4595 4600

ggt cac cag aag cac ctg cag ccg ggg caa gca ctg gtg tgg acg cca 15958
Gly His Gln Lys His Leu Gln Pro Gly Gln Ala Leu Val Trp Thr Pro
4605 4610 4615

cgt gga gaa ctg caa aag gtg aca ccg gtg gtg cgt gat ggg ggg gcg 16006
Arg Gly Glu Leu Gln Lys Val Thr Pro Val Val Arg Asp Gly Gly Ala
4620 4625 4630

gac gac agc gaa agc tat cgg tat gat gcg ggc agt cag cgt att atc 16054
Asp Asp Ser Glu Ser Tyr Arg Tyr Asp Ala Gly Ser Gln Arg Ile Ile
4635 4640 4645

aaa acc ggc acg cgg caa act ggc aac aac gtt cag aca cag cgg gta 16102
Lys Thr Gly Thr Arg Gln Thr Gly Asn Asn Val Gln Thr Gln Arg Val
4650 4655 4660

gtg tac ctg ccg ggg ctg gag tta cgt atc atg gca aat ggc gtg acg 16150
Val Tyr Leu Pro Gly Leu Glu Leu Arg Ile Met Ala Asn Gly Val Thr
4665 4670 4675 4680

gaa aaa gaa agc ctg cag gtt att acg gtg ggc gag gct ggg cgg gca 16198
Glu Lys Glu Ser Leu Gln Val Ile Thr Val Gly Glu Ala Gly Arg Ala
4685 4690 4695






caa gtg cgc gta ttg cac tgg gag atc ggc aag ccg gat gac ctc gat 16246
Gln Val Arg Val Leu His Trp Glu Ile Gly Lys Pro Asp Asp Leu Asp
4700 4705 4710

gag gac tcg gtg cgt tac agt tac gat aac ctg gtg ggc agc agc cag 16294
Glu Asp Ser Val Arg Tyr Ser Tyr Asp Asn Leu Val Gly Ser Ser Gln
4715 4720 4725

ctg gag ctg gac aga gag ggt tac ctt atc agt gag gag gag ttc tac 16342
Leu Glu Leu Asp Arg Glu Gly Tyr Leu Ile Ser Glu Glu Glu Phe Tyr
4730 4735 4740

ccg tat ggc gga acg gct gtt ctg acg gcg cga agt gag gtt gag gct 16390
Pro Tyr Gly Gly Thr Ala Val Leu Thr Ala Arg Ser Glu Val Glu Ala
4745 4750 4755 4760

gac tac aaa act atc cga tac tca ggc aag gag cgt gac gcg acg ggg 16438
Asp Tyr Lys Thr Ile Arg Tyr Ser Gly Lys Glu Arg Asp Ala Thr Gly
4765 4770 4775

ctg gat tat tac ggt tat cgg tat tac cag cca tgg gca ggg cgc tgg 16486
Leu Asp Tyr Tyr Gly Tyr Arg Tyr Tyr Gln Pro Trp Ala Gly Arg Trp
4780 4785 4790

ctc tcc acg gac ccg gca ggc acg gtg gac ggg ctg aac ctg ttc cgc 16534
Leu Ser Thr Asp Pro Ala Gly Thr Val Asp Gly Leu Asn Leu Phe Arg
4795 4800 4805

atg gtg cgg aat aat ccc gtc acg ctg ttt gac agc aac ggg cgg atc 16582
Met Val Arg Asn Asn Pro Val Thr Leu Phe Asp Ser Asn Gly Arg Ile
4810 4815 4820

agt act ggt cag gag gcc aga cga tta gtg ggg gaa gca ttt gtt cat 16630
Ser Thr Gly Gln Glu Ala Arg Arg Leu Val Gly Glu Ala Phe Val His
4825 4830 4835 4840

ccg tta cac atg cct gtt ttt gaa aga att tct gta gag aga aag att 16678
Pro Leu His Met Pro Val Phe Glu Arg Ile Ser Val Glu Arg Lys Ile
4845 4850 4855

tca atg agc gta agg gaa gct ggc att tat act att tca gcg ctg ggt 16726
Ser Met Ser Val Arg Glu Ala Gly Ile Tyr Thr Ile Ser Ala Leu Gly
4860 4865 4870

gaa ggt gca gca gca aaa ggc cat aat att cta gag aaa acc att aaa 16774
Glu Gly Ala Ala Ala Lys Gly His Asn Ile Leu Glu Lys Thr Ile Lys
4875 4880 4885

ccc ggt tcc ctg aag gct atc tat ggt gat aaa gct gag tca att ctt 16822
Pro Gly Ser Leu Lys Ala Ile Tyr Gly Asp Lys Ala Glu Ser Ile Leu
4890 4895 4900

gga ctg gca aaa cgt agc ggt ctc gtt ggc cga gta gga cag tgg gat 16870
Gly Leu Ala Lys Arg Ser Gly Leu Val Gly Arg Val Gly Gln Trp Asp
4905 4910 4915 4920



gca tca ggt gta cgt gga att tat gcg cac aac aga ccg ggt ggt gag 16918
Ala Ser Gly Val Arg Gly Ile Tyr Ala His Asn Arg Pro Gly Gly Glu
4925 4930 4935

gat ttg gtt tat cct gtc agc ctg cag aat act tct gcc aat gaa att 16966
Asp Leu Val Tyr Pro Val Ser Leu Gln Asn Thr Ser Ala Asn Glu Ile
4940 4945 4950

gtt aat gca tgg ata aaa ttt aaa atc atc acg ccc tac acc ggg gat 17014
Val Asn Ala Trp Ile Lys Phe Lys Ile Ile Thr Pro Tyr Thr Gly Asp
4955 4960 4965

tat gac atg cac gat att att aaa ttc tct gat ggg aaa ggg cat gtg 17062
Tyr Asp Met His Asp Ile Ile Lys Phe Ser Asp Gly Lys Gly His Val
4970 4975 4980

cct aca gcg gaa agt agt gag gaa aga gga gta aaa gat cta att aat 17110
Pro Thr Ala Glu Ser Ser Glu Glu Arg Gly Val Lys Asp Leu Ile Asn
4985 4990 4995 5000

aaa ggt gtt gcg gag gtc gat cct tcc aga ccc ttt gag tat aca gcg 17158
Lys Gly Val Ala Glu Val Asp Pro Ser Arg Pro Phe Glu Tyr Thr Ala
5005 5010 5015

atg aat gtt att cgc cat gga cca cag gtg aac ttt gtt ccc tat atg 17206
Met Asn Val Ile Arg His Gly Pro Gln Val Asn Phe Val Pro Tyr Met
5020 5025 5030

tgg gaa cat gag cac gat aaa gtc gtt aat gat aat ggt tat ctg ggg 17254
Trp Glu His Glu His Asp Lys Val Val Asn Asp Asn Gly Tyr Leu Gly
5035 5040 5045

gtg gta gct agc ccg ggg ccg ttc ccg gta gcg atg gta cat cag ggg 17302
Val Val Ala Ser Pro Gly Pro Phe Pro Val Ala Met Val His Gln Gly
5050 5055 5060

gaa tgg act gtt ttt gac aac agt gaa gaa ctg ttt aat ttc tat aaa 17350
Glu Trp Thr Val Phe Asp Asn Ser Glu Glu Leu Phe Asn Phe Tyr Lys
5065 5070 5075 5080

tct aca aat aca cct ctt cct gaa cac tgg tcc caa gat ttt atg gac 17398
Ser Thr Asn Thr Pro Leu Pro Glu His Trp Ser Gln Asp Phe Met Asp
5085 5090 5095

aga ggg aaa gga ata gtc gca act cct cgg cat gct gaa ctt ctt gat 17446
Arg Gly Lys Gly Ile Val Ala Thr Pro Arg His Ala Glu Leu Leu Asp
5100 5105 5110

aaa cga cga gtc atg tac taa tcgtaacgat ttcctgcctt acccaaagta 17497
Lys Arg Arg Val Met Tyr
5115

tacagcccgg tgagacattt tctctgtctc atttgggttg tttttgtctc atctgcatgt 17557
tatgtcttcc ctcatctaaa gtctaacgag acatttttag caaaatggca ctttacggtt 17617
atgttcgcgt ttcaaccgac ggtccggatt ttactctgta aatacagaca cttcgcgcag 17677
cctgctgcga aattatccgt gcgaaaaaag ccagcggcag cagccgggat ggacgaaatg 17737
aactgcagct tctgctggct tttttgcggc caggcaacat gctgatggtt acgtgagttg 17797
atcggctgcc accaaaaagt ccggagcgtg cggcccagat cgccgcaata atactgctgt 17857
atggtatttc catcaccact gtatatcgca cactctgggc cttccagaaa ccccataccg 17917 
cacaccggtg tgatcgctgg aagccccggg cattaccgcc gtctgtactc gaacactatt 17977 
gtggacttga tggttaggag attgaatcga ccatttttga gatccctaac catagatcgt 18037 
agagttgcac actcccagat ggcgtggctt agcgagcgat tatgcttaaa aattcatgtt 18097 
ttgctgtgtt tttaatccaa aacctgcttt tcaggcgcac ttatccagct acggggtctg 18157 
aagccatcgt ttttttgccg tacgatgtag cctgtcagag agcatttttg tggcgtgctc 18217 
gcccgctacg gtaccggcgg caaaacgcag ccggcctttg cagaggatgc actggtacgg 18277 
atcggtgccc aggaagcctt tcatcagcac cgcgaacccg ggccgtttcg gtttctcccg 18337 
taccgtcatc tccagcgcgt cgtaaacctt cggcagcagc gtgcccgttt gcggttggcc 18397 
agaaaaccat agtaacgcac cattttaaaa tgccgtgcag ggatatggct gacgtaacgc 18457 
tgcagcatct cctcctggct gattttctgg cgtttgtgct gctgcgtacg gtgatcgtaa 18517 
tactgatgca ccacggcccc gccgcggtag tggcgtagct gagaagccgc caccggcggg 18577 
cgcttcaggt accgggccag gtatttcacg ctgcgccagg cgccgcgggt ctttttggca 18637 
aaattcactt tccaggggcg gcggtattgc gcatgcaggg tcttcgttgc ggatatggcc 18697 
gagacccggc agggcgccag gattgatgcg cagcaggtga acgacggcat tgcgccagat 18757 
ggcttccacc tctttctttt taaagaacag ctgccgccag acgtggtgtt tgacgtcaag 18817 
accgccgcgg gtaacggaga cgtggatatg cggatgttga ttgagctgcc ggccttaggt 18877 
gtggagcgcg caaaaaatgc cggcctcgat gccctgccgg cgtgcccagc ggagcatggc 1893 7 
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 144 amino acid residues 
(B) TYPE: amino acid 
(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: PROTEIN (ORF 1) 

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Lys Ile Ser Ser Arg Gly Ile Ala Leu Ile Lys Glu Phe Glu Gly 
1 5 10 15 

Leu Arg Leu His Ala Tyr Arg Cys Ala Ala Asp Val Trp Thr Val Gly 
20 25 30 

Tyr Gly His Thr Ala Gly Val Thr Lys Gly Asp Ile Ile Thr Val Asp 
35 40 45 

Glu Ala Gln Thr Met Leu Thr Asn Asp Ile Thr Val Phe Glu Arg Ala 
50 55 60 

Val Ser Gln Ala Val Ala Val Pro Leu Asn Gln Ser Gln Tyr Asp Ala 
65 70 75 80 

Leu Val Ser Leu Val Phe Asn Ile Gly Gln Gly Asn Phe Lys Arg Ser 
85 90 95 

Thr Leu Leu Lys Lys Leu Asn Lys Gln Asp Tyr Val Gly Ala Gly Asn 
100 105 110 

Glu Phe Leu Arg Trp Thr Arg Ala Asn Gly Lys Val Leu Pro Gly Leu 
115 120 125 

Ile Arg Arg Arg Glu Ala Glu Arg Val Leu Phe Glu Lys Leu Gly 
130 135 140 

Ala 
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acid residues 
(B) TYPE: amino acid 
(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: PROTEIN (ORF 2) 

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Met Ser Pro Ser Pro Leu Thr Gly Ala Ala Leu Met Glu Thr Lys Met 
1 5 10 15 

Lys Ile His Tyr Gln Val Ala Ala Val Val Leu Thr Gly Val Met Val 
20 25 30 

Trp Gly Leu Ser His Trp Arg Tyr Thr Val Gly Tyr His Ala Ala Asp 
35 40 45 

Thr Gln Trp Gln Gln Arg Gln Ala Glu Gln Glu Arg Ala Asp Ala Leu 
50 55 60 

Ala Leu Leu Ala Ala Glu Thr Arg Glu Arg Lys Trp Glu Gln Gln Arg 
65 70 75 80 

Gln Thr Asp Met Asn Lys Val Ala Ile His Ala Glu Glu Glu Leu Ala 
85 90 95 

Ala Ala Arg Asp Ala Ala Ala Asp Ala Gln Arg Thr Gly Gln Arg Leu 
100 105 110 

Gln H\Q Thr Val Thr Thr Leu Gln Arg Gln Leu Ala Ser Arg Glu Thr 
115 120 125 

Arg Arg Leu Ser Ala Ala Thr Ala Ile Gly Thr Asp Asp Leu Gly Gly 
130 135 140 

Gln Pro Gly Val Leu Phe Ala Glu Leu Phe Arg Arg Ala Asp Gln Arg 
145 150 155 160 

Ala Gly Glu Leu Ala Ala Tyr Ala Asp Arg Thr Arg Val Lys Trp Gln 
165 170 175 

Ala Cys Gly Arg Ala Tyr Gln Ala Ala Thr His Glu Ala Glu Lys 
180 185 190 
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2376 amino acid residues 
(B) TYPE: amino acid 
(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: PROTEIN (SepA) 

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 



Met Arg Gln Asp Ile Met Tyr Asn Ile Asp Asp Ile Leu Glu Lys Val 
1 5 10 15 

Asn Ala Pro Arg Ala Arg Leu Ser Glu Glu Asn Asp Thr Ala Val Thr 
20 25 30 

Leu Thr Asp Leu Phe Ser Arg Ser Phe Pro Glu Val Lys Lys Ile Thr 
35 40 45 

Gly Asp Ser Leu Ser Trp Gly Glu Val Cys Tyr Leu Tyr Ser Gln Ala 
50 55 60 

Gln His Glu Gln Lys Glu Asn Arg Leu Thr Glu Ser Arg Ile Leu Ala 
65 70 75 80 

Arg Ala Asn Pro Leu Leu Val Asn Ala Val Arg Leu Gly Ile Arg Gln 
85 90 95 

Ala Ala Gly Ser Arg Ser Tyr Asp Asp Trp Phe Gly Ser Arg Ala Asp 
100 105 110 

Arg Phe Ala Arg Pro Gly Ser Val Ala Ser Met Phe Ser Pro Ala Ala 
115 120 125 

Tyr Leu Thr Glu Leu Tyr Arg Glu Ala Lys Asp Leu His Pro Asp Thr 
130 135 140 

Ser Leu Phe Arg Leu Asp Ile Arg Arg Pro Asp Leu Ala Ala Leu Ala 
145 150 155 160 

Leu Ser Gln Asn Asn Met Asp Asp Glu Leu Ser Thr Leu Ser Leu Ser 
165 170 175 

Asn Glu Leu Leu Tyr Arg Gly Ile Gly Ala Ala Glu Gly Leu Asp Asp 
180 185 190 
Asp Ser Val Arg Glu Leu Leu Ala Gly Tyr Arg Leu Thr Gly Leu Thr 
195 200 205 

Pro Tyr His Trp Ala Tyr Glu Ala Ala Arg Gln Ala Ile Leu Val Gln 
210 215 220 

Asp Pro Thr Leu Met Gly Phe Ser Arg Asn Pro Asp Val Ala Gln Leu 
225 230 235 240 

Met Asp Pro Ala Ser Met Leu Ala Ile Glu Ala Asp Ile Ser Pro Glu 
245 250 255 

Leu Tyr Gln Ile Leu Ala Glu Glu Ile Thr Thr Asp Ser Tyr Glu Ala 
260 265 270 

Leu Trp Ser Lys Asn Phe Gly Asp Met Pro Pro Ser Ser Leu Leu Ser 
275 280 285 

Tyr Asp Ala Leu Ala Thr Phe Tyr Asp Leu Asp Tyr Asp Glu Leu Thr 
290 295 300 

Ser Leu Leu Ser Leu Arg Leu Asp Phe Ser Asn Pro Asn Asn Glu Tyr 
305 310 315 320 

Tyr Ile Asn Ser Gln Leu Ser Val Val Thr Leu Asn Glu Ser Thr Gly 
325 330 335 

Leu Ile Thr Ile His His Tyr Leu Arg Thr Leu Gly Gly Asp Ser Gln 
340 345 350 

Gln Ile Asn Pro Glu Leu Ile Pro Tyr Gly Asp Gly Thr Tyr Leu Tyr 
355 360 365 

Asn Phe Ser Val Val Ser Thr Ile Ser Glu Asp Ser Phe Lys Leu Gly 
370 375 380 

Ser Leu Gly Ser Asn Ser Ser Asn Leu Tyr Ser Gly Asp Tyr Gln Leu 
385 390 395 400 

Gln Lys Gly Val Arg Tyr Ser Ile Pro Val Glu Ile Asp Glu Gly Lys 
405 410 415 

Leu Asn Asp Gly Ile Thr Ile Gly Leu Ser Arg Lys Gly Gly Gly Tyr 
420 425 430 

Tyr Ser Thr Val Asn Phe Thr Leu Ile Glu Tyr Asp Pro Ala Ile Phe 
435 440 445 

Ile Leu Lys Leu Asn Lys Val Ile Arg Leu Tyr Lys Ala Thr Gly Met 
450 455 460 

Thr Thr Ala Glu Ile Tyr Gln Ile Thr Asn Ile Leu Asn Asn Gly Leu 
465 470 475 480 

Thr Ile Asp His Ala Val Leu Ser Lys Ile Phe Leu Val Arg Tyr Leu 
485 490 495 



Met Arg His Tyr Gln Leu Asp Val Ala Arg Ser Leu Ile Leu Cys Asn 
500 505 510 

Gly Thr Ile Ser Asp Gln Ala Phe Ser Gly Glu Thr Gly Leu Phe Thr 
515 520 525 

Thr Leu Phe Asn Thr Pro Pro Leu Asn Gly Gln Leu Phe Ser Ala Asp 
530 535 540 

Asp Thr Pro Leu Asp Leu Arg Ser Glu Ala Pro Glu Asp Ala Phe Arg 
545 550 555 560 

Leu Ser Val Leu Lys Arg Ala Phe Asn Ile Ser Ala Ser Gly Leu Ser 
565 570 575 

Thr Leu Trp Gln Leu Ala Ser Gly Asp Ser Ser Ala Gly Phe Ser Cys 
580 585 590 

Ser Ala Asp Asn Ile Ala Ala Leu Tyr Arg Val Lys Leu Leu Ala Asp 
595 600 605 

Ile His Asp Leu Ser Ala Gly Glu Leu Ser Met Leu Leu Ser Val Ser 
610 615 620 

Pro Phe Ser Gly Val Ala Ala Gly Ser Leu Ser Asp Asn Glu Leu Thr 
625 630 635 640 

Gln Phe Leu Tyr Gln Thr Thr Thr Trp Leu Thr Glu Gln Gly Trp Thr 
645 650 655 

Val Ser Asp Val Phe Leu Met Leu Thr Thr Gln Tyr Gly Thr Leu Leu 
660 665 670 

Thr Pro Asp Ile Glu Asn Leu Leu Ala Ser Leu Arg Asn Gly Leu Ser 
675 680 685 

Gly Arg Glu Leu Phe Pro Glu Thr Leu Pro Gly Asp Gly Ala Pro Phe 
690 695 700 

Ile Ala Ala Ala Met Gln Leu Asp Ala Thr Asp Thr Ala Lys Ala Met 
705 710 715 720 

Leu Thr Trp Ala Asp Gln Leu Lys Pro Glu Gly Leu Thr Leu Thr Glu 
725 730 735 

Phe Ile Leu Leu Val Met Asn Ala Ala Pro Asn Asp Glu Gln Ala Gly 
740 745 750 

Gln Met Ala Gly Phe Cys Gln Ala Leu Trp Gln Leu Ala Leu Ile Ile 
755 760 765 

Arg Ser Thr Gly Leu Ser Thr Arg Glu Leu Thr Leu Leu Val Ser Gln 
770 775 780 

Pro Gly Arg Phe Arg Thr Gly Trp His His Leu Pro His Asp Leu Pro 
785 790 795 800 

Ala Leu Arg Asp Ile Thr Arg Phe His Ala Val Val Asn Arg Ser Gly 
805 810 815 
Ser His Ala Gly Glu Val Leu Thr Ala Leu Glu Thr Gly Glu Leu Ser 
820 825 830 

Ser Ala Leu Leu Ala Arg Ala Leu Ser Gln Asn Glu Gln Asp Val Thr 
835 840 845 

Gly Ala Leu Ala Gln Val Arg Gly Ala Gly Glu Gln Asp Asn Ser Val 
850 855 860 

Phe Thr Ser Trp Glu Glu Val Asp Gln Ala Glu Gln Trp Leu Asp Met 
865 870 875 880 

Ser Glu Thr Leu Ser Ile Thr Pro Ser Gly Leu Ala Ser Leu Ile Ala 
885 890 895 

Leu Lys Tyr Ile Asn Val Ser Asp Asp Ser Ala Pro Leu Tyr Ser Gln 
900 905 910 

Trp Gln Val Val Ser Gly Leu Leu Gln Ala Gly Leu Lys Ser Ser Gln 
915 920 925 

Ser Ser Ala Leu His Asp Tyr Leu Glu Glu Gly Thr Ser Ser Ala Leu 
930 935 940 

Cys Ala Tyr Tyr Leu Arg Asn Leu Ala Pro Asn Met Val Ser Gly Arg 
945 950 955 960 

Asp Asp Leu Phe Gly Tyr Leu Leu Leu Asp Asn Gln Val Ser Ala Lys 
965 970 975 

Val Lys Thr Thr Arg Ile Ala Glu Ala Ile Ala Gly Ile Arg Leu Tyr 
980 985 990 

Ile Asn Arg Ala Leu Asn Gly Ile Glu Leu Ser Ala Met Ala Glu Val 
995 1000 1005 

Arg Gly Arg Gln Phe Phe Thr Asp Trp Asp Thr Phe Asn Lys Arg Tyr 
1010 1015 1020 

Ser Thr Trp Ala Gly Val Ser Glu Leu Val Tyr Tyr Pro Glu Asn Tyr 
1025 1030 1035 1040 

Leu Asp Pro Thr Val Arg Ile Gly Gln Thr Gly Met Met Asp Thr Leu 
1045 1050 1055 

Leu Gln Ser Val Ser Gln Ser Ser Ile Asn Arg Asp Thr Val Glu Asp 
1060 1065 1070 

Ala Phe Lys Thr Tyr Leu Thr Thr Phe Glu Gln Ile Ala Asn Leu Asn 
1075 1080 1085 

Thr Val Ser Gly Tyr His Asp Asn Ala Ser Met Thr Gln Gly Thr Thr 
1090 1095 1100 

Trp Tyr Val Gly Arg Ser Ile Thr Asp Gln Thr Asn Trp Tyr Trp Arg 
1105 1110 1115 1120 
Ser Ala Asn His Ser Lys Ile Gln Asp Ser Met Met Pro Ala Asn Ala 
1125 1130 1135 

Trp Thr Gly Trp Thr Lys Ile Asn Cys Gly Met Asn Pro Trp Ser Asp 
1140 1145 1150 

Leu Val Cys Ser Val Phe Phe Asn Ser Arg Leu Tyr Val Val Trp Val 
1155 1160 1165 

Glu Glu Asn Gln Ser Ala Asp Thr Glu Ala Glu Ser Thr Thr Thr Thr 
1170 1175 1180 

Gln Gln Ser Tyr Thr Leu Lys Leu Ser Phe Arg Arg Tyr Asp Gly Thr 
1185 1190 1195 1200 

Trp Ser Ser Pro Val Ser Phe Asp Ile Thr Gly Asn Ile Ala Phe Pro 
1205 1210 1215 

Glu Thr Gln Gly Met His Val Thr Cys Asn Pro Leu Thr Glu Gln Leu 
1220 1225 1230 

Tyr Cys Ala Phe Tyr Ser Val Thr Ser Lys Pro Asp Phe Asp Asn Ala 
1235 1240 1245 

Gln Leu Ile Ser Val Asp Asn Asp Met Thr Leu Asn Val Ile Ser Asp 
1250 1255 1260 

Ile Gly Ile Phe Lys Ser Val Ser His Glu Phe Asn Thr Ser Thr Glu 
1265 1270 1275 1280 

Lys Phe Ile Asn Asn Val Phe Ser Asp Pro Ser Ala Asn Tyr Phe Val 
1285 1290 1295 

Ser Ala Thr Ser Leu Ile Asp Asp Val Ile His Ser Asp Phe Ser Leu 
1300 1305 1310 

Leu Asn Ser Lys Thr Thr Ser Thr Val Phe Thr Asn Glu Asp Ser Ser 
1315 1320 1325 

Leu Leu Thr Pro Glu Leu His Ile Thr Ala Asn Val Ser Cys Phe Val 
1330 1335 1340 

Ser Thr Ala Gly Ile Ala Thr Gln Ser Thr Ile Glu Lys Phe Val Gln 
1345 1350 1355 1360 

Ala Gly Ile Glu Phe Glu Glu Ile Asn Phe Tyr Ala Gly Gln Ala Ala 
1365 1370 1375 

Gly Gly Phe Asp Gly Phe Val Gly Val Asp Val Ser Asn Ser Lys Val 
1380 1385 1390 

Tyr Gln Val Gly Lys Glu Ala Val Gly Val Thr Val Lys Ser Tyr Ser 
1395 1400 1405 

Val Thr Gly Val Ser Gly Ser Val Glu Leu Phe Ile Asp Ser Ser Asn 
1410 1415 1420 



Lys Tyr Phe Ser Gly Ile Leu Ser Asp Lys Met Ile Thr Ala Leu Ile 
1425 1430 1435 1440 

Ser Gly Ser Thr Ser Lys Val Asn Tyr Val Ser Ser Ile Gly Ser Gln 
1445 1450 1455 

Asp Phe Trp Ser Val Lys Ser Leu Met Pro Ala Leu Gln Ile Tyr Glu 
1460 1465 1470 

Leu Ile Asp Asp Ile Ile Leu Thr Ser Gly Val Asn Gly Thr Glu Ile 
1475 1480 1485 

Lys Ser Trp Pro Ser Ala Glu Trp Tyr Asn Asp Lys Leu Ser Leu Gln 
1490 1495 1500 

Ser Gly Asn Asn Leu Phe Asn Thr Lys Ser Leu Ser Phe Thr Val Asn 
1505 1510 1515 1520 

Thr Ser Asp Ile Val Glu Asp Glu Phe Asp Val Thr Phe Thr Phe Thr 
1525 1530 1535 

Ala Val Asp Gln Asn Asn Val Val Leu Ala Ala Arg Thr Ala Ile Leu 
1540 1545 1550 

Thr Val Ile Arg Asn Ile Asn Asn Asp Thr Ser Val Ile Ala Leu Arg 
1555 1560 1565 

Lys Asn Thr Arg Gly Ala Gln Tyr Ile Arg Phe Thr Ala Gly Asn Asp 
1570 1575 1580 

Val Ala Leu Ile Arg Leu Asn Thr Leu Phe Ala Arg Gln Leu Val Asp 
1585 1590 1595 1600 

Arg Ala Asn Thr Gly Ile Asp Thr Ile Leu Ser Met Glu Thr Gln Arg 
1605 1610 1615 

Leu Thr Glu Pro Ala Leu Glu Glu Gly Ser Asp Val Phe Met Asp Phe 
1620 1625 1630 

Ser Gly Ala Asn Ala Leu Tyr Phe Trp Glu Leu Phe Tyr Tyr Thr Pro 
1635 1640 1645 

Met Met Val Phe Gln Arg Leu Leu Gln Glu Gln His Phe Pro Glu Ala 
1650 1655 1660 

Thr Arg Trp Leu Gln Tyr Val Trp Asn Pro Ala Gly His Val Val Asn 
1665 1670 1675 1680 

Gly Val Leu Gln Asn Tyr Thr Trp Asn Val Arg Pro Leu Glu Glu Asp 
1685 1690 1695 

Thr Gly Trp Asn Asp Ser Pro Leu Asp Ser Ile Asp Pro Asp Ala Ile 
1700 1705 1710 

Ala Gln Tyr Asp Pro Met His Tyr Lys Val Ala Thr Phe Met Ser Tyr 
1715 1720 1725 

Leu Asp Leu Leu Ile Ala Arg Gly Asp Ala Ala Tyr Arg Leu Leu Glu 
1730 1735 1740 
Arg Asp Thr Leu Asn Glu Ala Arg Met Trp Tyr Val Gln Ala Leu Asn 
1745 1750 1755 1760 

Leu Leu Gly Asp Glu Pro Tyr Ile Ser Phe Asp Ala Asp Trp Ser Ala 
1765 1770 1775 

Leu Thr Leu Gly Asp Ala Ala Ser Glu Val Thr Arg Arg Asp Tyr Gln 
1780 1785 1790 

Glu Ala Leu Leu Ala Val Arg Arg Leu Val Pro Ala Pro Glu Thr Arg 
1795 1800 1805 

Thr Ala Asn Ser Leu Thr Ala Leu Phe Leu Pro Gln Gln Asn Glu Val 
1810 1815 1820 

Leu Lys Gly Tyr Trp Gln Thr Leu Ala Gln Arg Leu His Asn Leu Arg 
1825 1830 1835 1840 

His Asn Leu Ser Ile Asp Gly Gln Pro Leu Ser Leu Ser Val Tyr Ala 
1845 1850 1855 

Thr Pro Ser Glu Pro Ser Ala Leu Gln Ser Ala Val Val Asn Ser Ala 
1860 1865 1870 

Gln Gly Ala Ala Ala Leu Pro Ala Ala Val Met Pro Leu Tyr Ser Phe 
1875 1880 1885 

Pro Val Met Leu Glu Asn Ala Arg Gly Met Val Ser Leu Leu Thr Gly 
1890 1895 1900 

Phe Gly Asn Thr Leu Leu Gly Ile Thr Glu Arg Gln Asp Ala Glu Ala 
1905 1910 1915 1920 

Leu Ala Lys Leu Leu Gln Thr Gln Gly Ser Glu Leu Ile Arg Gln Gly 
1925 1930 1935 

Leu Arg Gln Gln Asp Asn Val Leu Glu Glu Ile Asp Ala Asp Ile Ala 
1940 1945 1950 

Ala Leu Glu Glu Ser Arg Arg Gly Ala Gln Met Arg Phe Glu Arg Tyr 
1955 1960 1965 

Lys Val Leu Tyr Glu Ala Asp Val Asn Thr Gly Glu Lys Gln Ala Met 
1970 1975 1980 

Asp Leu Tyr Leu Ser Ser Ser Val Leu Ser Ala Ser Thr Ala Ala Leu 
1985 1990 1995 2000 

Phe Leu Ala Glu Ala Ala Ala Asp Met Leu Pro Asn Ile Tyr Gly Leu 
2005 2010 2015 

Ala Val Gly Gly Ser Arg Tyr Gly Ala Leu Phe Lys Ala Thr Ala Ile 
2020 2025 2030 

Gly Ile Gln Val Ser Ser Asp Ala Thr Arg Ile Ser Ala Asp Lys Ile 
2035 2040 2045 
Ser Gln Ser Glu Val Tyr Arg Arg Arg Arg Glu Glu Trp Glu Ile Gln 
2050 2055 2060 

Arg Asp Ser Ala Gln Ser Asp Val Ala Gln Ile Asp Ala Gln Leu Ala 
2065 2070 2075 2080 

Ala Met Ala Val Arg Arg Glu Gly Ala Glu Leu Gln Lys Thr Tyr Leu 
2085 2090 2095 

Glu Thr Gln Gln Thr Gln Ala Gln Ala Gln Leu Ala Phe Leu Gln Ser 
2100 2105 2110 

Lys Phe Asn Asn Thr Ala Leu Tyr Ser Trp Leu Arg Gly Arg Leu Ser 
2115 2120 2125 

Ala Ile Tyr Tyr Gln Phe Tyr Asp Leu Ala Val Ser Arg Cys Leu Met 
2130 2135 2140 

Ala Gln Gln Ala Trp Gln Trp Asp Lys Phe Glu Thr Arg Ser Phe Ile 
2145 2150 2155 2160 

Gln Pro Gly Ala Trp Met Gly Ala Asn Ala Gly Leu Leu Ala Gly Glu 
2165 2170 2175 

Thr Leu Met Leu Asn Leu Ala Gln Met Glu Gln Ala Trp Leu Thr Gly 
2180 2185 2190 

Asp Glu Arg Ala Ile Glu Val Thr Arg Thr Val Cys Leu Ser Glu Val 
2195 2200 2205 

Tyr Thr Ser Leu Ala Glu Asp Ala Ala Phe Ser Leu Ala Asp Lys Val 
2210 2215 2220 

Val Glu Leu Val Ser Asn Gly Ser Gly Ser Ala Gly Thr Lys Ser Asn 
2225 2230 2235 2240 

Gly Leu Gln Met Asp Gln Gln Gln Leu Glu Ala Thr Leu Lys Leu Ala 
2245 2250 2255 

Asp Leu Gly Ile Gly Asn Asp Tyr Pro Val Ser Leu Gly Thr Met Arg 
2260 2265 2270 

Arg Ile Lys Gln Ile Ser Val Thr Leu Pro Ala Leu Val Gly Pro Tyr 
2275 2280 2285 

Gln Asp Val Arg Ala Val Leu Ser Tyr Gly Gly Ser Met Val Met Pro 
2290 2295 2300 

Arg Gly Cys Ser Ala Leu Ala Val Ser His Gly Met Asn Asp Ser Gly 
2305 2310 2315 2320 

Gln Phe Gln Leu Asp Phe Asn Asp Pro Arg Tyr Leu Pro Phe Glu Gly 
2325 2330 2335 

Leu Pro Val Asp Asp Thr Gly Thr Leu Thr Leu Ser Phe Pro Asp Ala 
2340 2345 2350 

Asp Gly Lys Gln Gln Ala Met Leu Leu Ser Leu Ser Asp Ile Ile Leu 
2355 2360 2365 

Hie Ile Arg Tyr Thr Ile Ile Ser 
2370 2375 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1429 amino acid residues 
(B) TYPE: amino acid 
(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: PROTEIN (SepB) 

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 



Met Gln Asn His Gln Asp Met Ala Ile Thr Ala Pro Thr Leu Pro Ser 
1 5 10 15 

Gly Gly Gly Ala Val Thr Gly Leu Lys Gly Asp Ile Ala Ala Ala Gly 
20 25 30 

Pro Asj* Gly Ala Ala Thr Leu Ser Ile Pro Leu Pro Val Ser Pro Gly 
35 40 45 

Arg Gly Tyr Ala Pro Thr Gly Ala Leu Asn Tyr His Ser Arg Ser Gly 
50 55 60 
Asn Gly Pro Phe Gly Ile Gly Trp Gly Ile Gly Gly Ala Ala Val Gln 
65 70 75 80 

Arg Arg Thr Arg Asn Gly Ala Pro Thr Tyr Asp Asp Thr Asp Glu Phe 
85 90 95 

Thr Gly Pro Asp Gly Glu Val Leu Val Pro Ala Leu Thr Ala Ala Gly 
100 105 110 

Thr Gln Glu Ala Arg Gln Ala Thr Ser Leu Leu Gly Ile Asn Pro Gly 
115 120 125 

Gly Ser Phe Asn Val Gln Val Tyr Arg Ser Arg Thr Glu Gly Ser Leu 
130 135 140 

Ser Arg Leu Glu Arg Trp Leu Pro Ala Asp Glu Thr Glu Thr Glu Phe 
145 150 155 160 

Trp Val Leu Tyr Thr Pro Asp Gly Gln Val Ala Leu Leu Gly Arg Asn 
165 170 175 

Ala Gln Ala Arg Ile Ser Asn Pro Thr Ala Pro Thr Gln Thr Ala Val 
180 185 190 

Trp Leu Met Glu Ser Ser Val Ser Leu Thr Gly Glu Gln Met Tyr Tyr 
195 200 205 

Gln Tyr Arg Ala Glu Asp Asp Asp Gly Cys Asp Glu Ala Glu Arg Asp 
210 215 220 

Ala His Pro Gln Ala Gly Ala Gln Arg Tyr Pro Val Ala Val Trp Tyr 
225 230 235 240 

Gly Asn Arg Gln Ala Ala Arg Thr Leu Pro Ala Leu Val Ser Thr Pro 
245 250 255 

Ser Met Asp Ser Trp Leu Phe Ile Leu Val Phe Asp Tyr Gly Glu Arg 
260 265 270 

Ser Ser Val Leu Ser Glu Ala Pro Ala Trp Gln Thr Pro Gly Ser Gly 
275 280 285 

Glu Trp Leu Cys Arg Gln Asp Cys Phe Ser Gly Tyr Glu Phe Gly Phe 
290 295 300 

Asn Leu Arg Thr Arg Arg Leu Cys Arg Gln Val Leu Met Phe His Tyr 
305 310 315 320 

Leu Gly Val Leu Ala Gly Ser Ser Gly Ala Asn Asp Ala Pro Ala Leu 
325 330 335 

Ile Ser Arg Leu Leu Leu Asp Tyr Arg Glu Ser Pro Ser Leu Ser Leu 
340 345 350 

Leu Glu Asn Val His Gln Val Ala Tyr Glu Ser Asp Gly Thr Ser Cys 
355 360 365 
Ala Leu Pro Ala Leu Ala Leu Gly Trp Gln Thr Phe Thr Pro Pro Thr 
370 375 380 

Leu Ser Ala Trp Gln Thr Arg Asp Asp Met Gly Lys Leu Ser Leu Leu 
385 390 395 400 

Gln Pro Tyr Gln Leu Val Asp Leu Asn Gly Glu Gly Val Val Gly Ile 
405 410 415 

Leu Tyr Gln Asp Ser Gly Ala Trp Trp Tyr Arg Glu Pro Val Arg Gln 
420 425 430 

Ser Gly Asp Asp Pro Asp Ala Val Thr Trp Gly Ala Ala Ala Ala Leu 
435 440 445 

Pro Thr Met Pro Ala Leu His Asn Ser Gly Ile Leu Ala Asp Leu Asn 
450 455 460 

Gly Asp Gly Arg Leu Glu Trp Val Val Thr Ala Pro Gly Val Ala Gly 
465 470 475 480 

Met Tyr Asp Arg Thr Pro Gly Arg Asp Trp Leu His Phe Thr Pro Leu 
485 490 495 

Ser Ala Leu Pro Val Glu Tyr Ala His Pro Lys Ala Val Leu Ala Asp 
500 505 510 

Ile Leu Gly Ala Gly Leu Thr Asp Met Val Leu Ile Gly Pro Arg Ser 
515 520 525 

Val Arg Leu Tyr Ser Gly Lys Asn Asp Gly Trp Asn Lys Gly Glu Thr 
530 535 540 

Val Gln Gln Thr Glu Arg Leu Thr Leu Pro Val Pro Gly Val Asp Pro 
545 550 555 560 

Arg Thr Leu Val Ala Phe Ser Asp Met Ala Gly Ser Gly Gln Gln His 
565 570 575 

Leu Thr Glu Val Arg Ala Asn Gly Val Arg Tyr Trp Pro Asn Leu Gly 
580 585 590 

His Gly Arg Phe Gly Gln Pro Val Asn Ile Pro Gly Phe Ser Gln Ser 
595 600 605 

Val Thr Thr Phe Asn Pro Asp Gln Ile Leu Leu Ala Asp Thr Asp Gly 
610 615 620 

Ser Gly Thr Thr Asp Leu Ile Tyr Ala Met Ser Asp Arg Leu Val Ile 
625 630 635 640 

Tyr Phe Asn Gln Ser Gly Asn Tyr Phe Ala Glu Pro His Thr Leu Leu 
645 650 655 

Leu Pro Lys Gly Val Arg Tyr Asp Arg Thr Cys Ser Leu Gln Val Ala 
660 665 670 

Asp Ile Gln Gly Leu Gly Val Pro Ser Leu Leu Leu Thr Val Pro His 
675 680 685 

Val Ala Pro His His Trp Val Cys His Leu Ser Ala Asp Lys Pro Trp 
690 695 700 

Leu Leu Asn Gly Met Asn Asn Asn Met Gly Ala Arg His Ala Leu His 
705 710 715 720 

Tyr Arg Ser Ser Val Gln Phe Trp Leu Asp Glu Lys Ala Glu Ala Leu 
725 730 735 

Ala Ala Gly Ser Ser Pro Ala Cys Tyr Leu Pro Phe Thr Leu His Thr 
740 745 750 

Leu Trp Arg Ser Val Val Gln Asp Glu Ile Thr Gly Asn Arg Leu Val 
755 760 765 

Ser Asp Val Leu Tyr Arg His Gly Val Trp Asp Gly Gln Glu Arg Glu 
770 775 780 

Phe Arg Gly Phe Gly Phe Val Glu Ile Arg Asp Thr Asp Thr Leu Ala 
785 790 795 800 

Ser Gln Gly Thr Ala Thr Glu Leu Ser Met Pro Ser Val Ser Arg Asn 
805 810 815 

Trp Tyr Ala Thr Gly Val Pro Ala Val Asp Glu Arg Leu Pro Glu Thr 
820 825 830 

Tyr Trp Gln Asn Asp Ala Ala Ala Phe Ala Asp Phe Ala Thr Arg Phe 
835 840 845 

Thr Val Gly Ser Gly Glu Asp Glu Gln Thr Tyr Thr Pro Asp Asp Ser 
850 855 860 

Lys Thr Phe Trp Leu Gln Arg Ala Leu Lys Gly Ile Leu Leu Arg Ser 
865 870 875 880 

Glu Leu Tyr Gly Ala Asp Gly Ser Ser Gln Ala Asp Ile Pro Tyr Ser 
885 890 895 

Val Thr Glu Ser Arg Pro Gln Val Arg Leu Val Glu Ala Asn Gly Asp 
900 905 910 

Tyr Pro Val Val Trp Pro Met Gly Ala Glu Ser Arg Thr Ser Val Tyr 
915 920 925 

Glu Arg Tyr His Asn Asp Pro Gln Cys Gln Gln Gln Ala Val Leu Leu 
930 935 940 

Ser Asp Glu Tyr Gly Phe Pro Leu Arg Gln Val Ser Val Asn Tyr Pro 
945 950 955 960 

Arg Arg Pro Pro Ser Ala Asp Asn Pro Tyr Pro Ala Ser Leu Pro Ala 
965 970 975 

Thr Leu Phe Ala Asn Ser Tyr Asp Glu Gln Gln Gln Ile Leu Arg Leu 
980 985 990 
Gly Leu Gln Gln Ser Ser Ala His His Leu Val Ser Leu Ser Glu Gly 
995 1000 1005 

His Trp Leu Leu Gly Leu Ala Glu Ala Ser Arg Asp Asp Val Phe Thr 
1010 1015 1020 

Tyr Ser Ala Asp Asn Val Pro Glu Gly Gly Leu Thr Leu Glu His Leu 
1025 1030 1035 1040 

Leu Ala Pro Glu Ser Leu Val Ser Asp Ser Gln Val Gly Thr Leu Ala 
1045 1050 1055 

Gly Gln Gln Gln Val Trp Tyr Leu Asp Ser Gln Asp Val Ala Thr Val 
1060 1065 1070 

Ala Ala Pro Pro Leu Pro Pro Lys Val Ala Phe Ile Glu Thr Ala Val 
1075 1080 1085 

Leu Asp Glu Gly Met Val Ser Ser Leu Ala Ala Tyr Ile Val Asp Glu 
1090 1095 1100 

His Leu Glu Gln Ala Gly Tyr Arg Gln Ser Gly Tyr Leu Phe Pro Arg 
1105 1110 1115 1120 

Gly Arg Glu Ala Glu Gln Ala Leu Trp Thr Gln Cys Gln Gly Tyr Val 
1125 1130 1135 

Thr Tyr Ala Gly Ala Glu His Phe Trp Leu Pro Leu Ser Phe Arg Asp 
1140 1145 1150 

Ser Met Leu Thr Gly Pro Val Thr Val Thr Arg Asp Ala Tyr Asp Cys 
1155 1160 1165 

Val Ile Thr Gln Trp Gln Asp Ala Ala Gly Ile Val Thr Thr Ala Asp 
1170 1175 1180 

Tyr Asp Trp Arg Phe Leu Thr Pro Val Arg Val Thr Asp Pro Asn Asp 
1185 1190 1195 1200 

Asn Leu Gln Ser Val Thr Leu Asp Ala Leu Gly Arg Val Thr Thr Leu 
1205 1210 1215 

Arg Phe Trp Gly Thr Glu Asn Gly Ile Ala Thr Gly Tyr Ser Asp Ala 
1220 1225 1230 

Thr Leu Ser Val Pro Asp Gly Ala Ala Ala Ala Leu Ala Leu Thr Ala 
1235 1240 1245 

Pro Leu Pro Val Ala Gln Cys Leu Val Tyr Val Thr Asp Ser Trp Gly 
1250 1255 1260 

Asp Asp Asp Asn Glu Lys Met Pro Pro His Val Val Val Leu Ala Thr 
1265 1270 1275 1280 

Asp Arg Tyr Asp Ser Asp Thr Gly Gln Gln Val Arg Gln Gln Val Thr 
1285 1290 1295 
Phe Ser Asp Gly Phe Gly Arg Glu Leu Gln Ser Ala Thr Arg Gln Ala 
1300 1305 1310 

Glu Gly Asn Ala Trp Gln Arg Gly Arg Asp Gly Lys Leu Val Thr Ala 
1315 1320 1325 

Ser Asp Gly Leu Pro Val Thr Val Ala Thr Asn Phe Arg Trp Ala Val 
1330 1335 1340 

Thr Gly Arg Ala Glu Tyr Asp Asn Lys Gly Leu Pro Val Arg Val Tyr 
1345 1350 1355 1360 

Gln Pro Tyr Phe Leu Asp Ser Trp Gln Tyr Val Ser Asp Asp Ser Ala 
1365 1370 1375 

Arg Gln Asp Leu Tyr Ala Asp Thr His Phe Tyr Asp Pro Thr Ala Arg 
1380 1385 1390 

Glu Trp Gln Val Ile Thr Ala Lys Gly Glu Arg Arg Gln Val Leu Tyr 
1395 1400 1405 

Thr Pro Trp Phe Val Val Ser Glu Asp Glu Asn Asp Thr Val Gly Leu 
1410 1415 1420 

Asn Asp Ala Ser 
1425 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 973 amino acid residues 
(B) TYPE: amino acid 
(D) TOPOLOGY: Linear 

(ii) MOLECULE TYPE: PROTEIN (SepC) 

(ix) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 



Met Ser Thr Ser Leu Phe Ser Ser Thr Pro Ser Val Ala Val Leu Asp 
1 5 10 15 

Asn Arg Gly Leu Leu Val Arg Glu Leu Gln Tyr Tyr Arg His Pro Asp 
20 25 30 

Thr Pro Glu Glu Thr Asp Glu Arg Ile Thr Cys His Gln His Asp Glu 
35 40 45 

Arg Gly Ser Leu Ser Gln Ser Ala Asp Pro Arg Leu His Ala Ala Gly 
50 55 60 

Leu Thr Asn Phe Thr Tyr Leu Asn Ser Leu Thr Gly Thr Val Leu Gln 
65 70 75 80 

Ser Val Ser Ala Asp Ala Gly Thr Ser Leu Glu Leu Ser Asp Ala Ala 
85 90 95 

Gly Arg Ala Phe Leu Ala Val Thr Gly Ala Gly Thr Glu Asp Ala Val 
100 105 110 

Thr Arg Thr Trp Gln Tyr Glu Asp Asp Thr Leu Pro Gly Arg Pro Leu 
115 120 125 

Ser Ile Thr Glu Gln Val Thr Gly Glu Ala Ala Gln Ile Thr Glu Arg 
130 135 140 

Phe Val Tyr Ala Gly Asn Thr Asp Ala Glu Lys Ile Leu Asn Leu Ala 
145 150 155 160 

Gly Gln Cys Val Ser His Tyr Asp Thr Ala Gly Leu Val Gln Thr Asp 
165 170 175 

Ser Ile Ala Leu Ser Gly Val Pro Leu Ala Val Thr Arg Gln Leu Leu 
180 185 190 

Pro Asp Ala Ala Gly Ala Asn Trp Met Gly Glu Asp Ala Ser Ala Trp 
195 200 205 

Asn Asp Leu Leu Asp Gly Glu Thr Phe Phe Thr Gln Thr His Ala Asp 
210 215 220 

Ala Thr Gly Ala Val Leu Ser Ile Thr Asp Ala Lys Gly Asn Leu Gln 
225 230 235 240 
Arg Val Ala Tyr Asp Val Ala Gly Leu Leu Ser Gly Ser Trp Leu Thr 
245 250 255 

Leu Lys Asp Gly Thr Glu Gln Val Ile Val Ala Ser Leu Thr Tyr Ser 
260 265 270 

Ala Ala Gly Lys Lys Leu Arg Glu Glu His Gly Asn Gly Val Val Thr 
275 280 285 

Ser Tyr Ile Tyr Glu Pro Glu Thr Gln Arg Leu Thr Gly Ile Lys Thr 
290 295 300 

Glu Arg Pro Ser Gly His Val Ala Gly Ala Lys Val Leu Gln Asp Leu 
305 310 315 320 

Arg Tyr Thr Tyr Asp Pro Val Gly Asn Val Leu Ser Val Asn Asn Asp 
325 330 335 

Ala Glu Glu Thr Arg Phe Trp Arg Asn Gln Lys Val Val Pro Glu Asn 
340 345 350 

Thr Tyr Ile Tyr Asp Ser Leu Tyr Gln Leu Val Ser Ala Thr Gly Arg 
355 360 365 

Glu Met Ala Asn Ala Gly Gln Gln Gly Asn Asp Leu Pro Ser Ala Thr 
370 375 380 

Ala Pro Leu Pro Thr Asp Ser Ser Ala Tyr Thr Asn Tyr Thr Arg Thr 
385 390 395 400 

Tyr Arg Tyr Asp Arg Gly Gly Asn Leu Thr Gln Met Arg His Ser Ala 
405 410 415 

Pro Ala Thr Asn Asn Asn Tyr Thr Thr Asp Ile Thr Val Ser Asp Arg 
420 425 430 

Ser Asn Arg Ala Val Leu Ser Thr Leu Ala Glu Val Pro Ser Asp Val 
435 440 445 

Asp Met Leu Phe Ser Ala Gly Gly His Gln Lys His Leu Gln Pro Gly 
450 455 460 

Gln Ala Leu Val Trp Thr Pro Arg Gly Glu Leu Gln Lys Val Thr Pro 
465 470 475 480 

Val Val Arg Asp Gly Gly Ala Asp Asp Ser Glu Ser Tyr Arg Tyr Asp 
485 490 495 

Ala Gly Ser Gln Arg Ile Ile Lys Thr Gly Thr Arg Gln Thr Gly Asn 
500 505 510 

Asn Val Gln Thr Gln Arg Val Val Tyr Leu Pro Gly Leu Glu Leu Arg 
515 520 525 

Ile Met Ala Asn Gly Val Thr Glu Lys Glu Ser Leu Gln Val Ile Thr 
530 535 540 

Val Gly Glu Ala Gly Arg Ala Gln Val Arg Val Leu His Trp Glu Ile 
545 550 555 560 

Gly Lys Pro Asp Asp Leu Asp Glu Asp Ser Val Arg Tyr Ser Tyr Asp 
565 570 575 

Asn Leu Val Gly Ser Ser Gln Leu Glu Leu Asp Arg Glu Gly Tyr Leu 
580 585 590 

Ile Ser Glu Glu Glu Phe Tyr Pro Tyr Gly Gly Thr Ala Val Leu Thr 
595 600 605 

Ala Arg Ser Glu Val Glu Ala Asp Tyr Lys Thr Ile Arg Tyr Ser Gly 
610 615 620 

Lys Glu Arg Asp Ala Thr Gly Leu Asp Tyr Tyr Gly Tyr Arg Tyr Tyr 
625 630 635 640 

Gln Pro Trp Ala Gly Arg Trp Leu Ser Thr Asp Pro Ala Gly Thr Val 
645 650 655 

Asp Gly Leu Asn Leu Phe Arg Met Val Arg Asn Asn Pro Val Thr Leu 
660 665 670 

Phe Asp Ser Asn Gly Arg Ile Ser Thr Gly Gln Glu Ala Arg Arg Leu 
675 680 685 

Val Gly Glu Ala Phe Val His Pro Leu His Met Pro Val Phe Glu Arg 
690 695 700 

Ile Ser Val Glu Arg Lys Ile Ser Met Ser Val Arg Glu Ala Gly Ile 
705 710 715 720 

Tyr Thr Ile Ser Ala Leu Gly Glu Gly Ala Ala Ala Lys Gly His Asn 
725 730 735 

Ile Leu Glu Lys Thr Ile Lys Pro Gly Ser Leu Lys Ala Ile Tyr Gly 
740 745 750 

Asp Lys Ala Glu Ser Ile Leu Gly Leu Ala Lys Arg Ser Gly Leu Val 
755 760 765 

Gly Arg Val Gly Gln Trp Asp Ala Ser Gly Val Arg Gly Ile Tyr Ala 
770 775 780 

His Asn Arg Pro Gly Gly Glu Asp Leu Val Tyr Pro Val Ser Leu Gln 
785 790 795 800 

Asn Thr Ser Ala Asn Glu Ile Val Asn Ala Trp Ile Lys Phe Lys Ile 
805 810 815 

Ile Thr Pro Tyr Thr Gly Asp Tyr Asp Met His Asp Ile Ile Lys Phe 
820 825 830 

Ser Asp Gly Lys Gly His Val Pro Thr Ala Glu Ser Ser Glu Glu Arg 
835 840 845 

Gly Val Lys Asp Leu Ile Asn Lys Gly Val Ala Glu Val Asp Pro Ser 
850 855 860 
Arg Pro Phe Glu Tyr Thr Ala Met Asn Val Ile Arg His Gly Pro Gln 
865 870 875 880 

Val Asn Phe Val Pro Tyr Met Trp Glu His Glu His Asp Lys Val Val 
885 890 895 

Asn Asp Asn Gly Tyr Leu Gly Val Val Ala Ser Pro Gly Pro Phe Pro 
900 905 910 

Val Ala Met Val His Gln Gly Glu Trp Thr Val Phe Asp Asn Ser Glu 
915 920 925 

Glu Leu Phe Asn Phe Tyr Lys Ser Thr Asn Thr Pro Leu Pro Glu His 
930 935 940 

Trp Ser Gln Asp Phe Met Asp Arg Gly Lys Gly Ile Val Ala Thr Pro 
945 950 955 960 

Arg His Ala Glu Leu Leu Asp Lys Arg Arg Val Met Tyr 
965 970 